Abstract

We consider branching random walks built on Galton-Watson trees with offspring distribution having a bounded support, conditioned to have $n$ nodes, and their rescaled convergences to the Brownian snake. We exhibit a notion of "globally centered discrete snake" that extends the usual settings in which the displacements are supposed centered. We show that under some additional moment conditions, when $n$ goes to $+\infty$, "globally centered discrete snakes" converge to the Brownian snake. The proof relies on a precise study of the "lineage" of the nodes in a Galton-Watson tree conditioned by the size, and their links with a multinomial process. Some consequences concerning Galton-Watson trees conditioned by the size are also derived.

1.1 A model of centered discrete snake

We first begin with the formal description of the notion of trees and branching random walks. Let $\mathcal T_\infty = \{\varnothing\} \cup \bigcup_{n\ge 1} (\mathbb N^*)^n$ be the set of finite words on the alphabet $\mathbb N^* = \{1,2,\dots\}$. For $u = u_1\dots u_n$ and $v = v_1\dots v_m \in \mathcal T_\infty$, we let $uv = u_1\dots u_n v_1\dots v_m$ be the concatenation of the words $u$ and $v$ (by convention $\varnothing u = u\varnothing = u$). Following Neveu, we call planar tree $T$ a subset of $\mathcal T_\infty$ containing the root $\varnothing$, and such that if $ui \in T$, then $u \in T$ and for all $j \in [\![1,i]\!]$, $uj \in T$. The elements of a tree are called nodes or vertices. For $i \ne j$, the nodes $ui$ and $uj$ are called brothers and $u$ their father. We let $c_u(T) = \max\{i : ui \in T\}$ be the number of children of $u$. A node without any child is called a leaf, and we denote by $\partial T$ the set of leaves of $T$. If $v \ne \varnothing$, we say that $uv$ is a descendant of $u$ and $u$ is an ancestor of $uv$. An edge is a pair $\{u,v\}$ where $u$ is the father of $v$. A path $[\![u,v]\!]$ between the nodes $u$ and $v$ in a tree $T$ is the (minimal) sequence of nodes $u := u_0,\dots,u_j := v$ such that for any $i \in [\![0,j-1]\!]$, $\{u_i,u_{i+1}\}$ is an edge. Set also $]\!]u,v[\![ \,= [\![u,v]\!] \setminus \{u,v\}$ and similar notation for $[\![u,v[\![$ and for $]\!]u,v]\!]$. The distance $d_T$, or simply $d$, is the usual graph distance. The depth of $u$ is $|u| = d(\varnothing,u)$. The cardinality of $T$ is denoted by $|T|$, and we let $\mathcal T$ (resp. $\mathcal T_n$) be the set of planar trees (resp. with $n$ edges, i.e. $n+1$ vertices).
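
As an illustration of Neveu's formalism, here is a minimal Python sketch in which a word is a tuple of positive integers and the root is the empty tuple. The function names (`is_planar_tree`, `num_children`, `leaves`) are hypothetical helpers introduced only for this example.

```python
def is_planar_tree(nodes):
    """Check Neveu's conditions for a finite set of words (tuples of positive ints)."""
    nodes = set(nodes)
    if () not in nodes:              # the root must belong to the tree
        return False
    for w in nodes:
        if w:
            father, i = w[:-1], w[-1]
            if father not in nodes:  # if ui is in T then u is in T
                return False
            # ... and all the elder brothers u1, ..., u(i-1) are in T
            if any(father + (j,) not in nodes for j in range(1, i)):
                return False
    return True


def num_children(tree, u):
    """c_u(T): the largest i such that ui is in T (0 for a leaf)."""
    i = 0
    while u + (i + 1,) in tree:
        i += 1
    return i


def leaves(tree):
    """The set of nodes without any child."""
    return {u for u in tree if num_children(tree, u) == 0}


# The root has two children and its first child has one child.
T = {(), (1,), (2,), (1, 1)}
```

With this encoding the depth $|u|$ is simply `len(u)`, and `len(T) - 1` is the number of edges.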

A branching walk is a pair $(T,\ell)$ where $T$ is a tree called the underlying tree and $\ell$, the label function, is an application from $T$ taking its values in $\mathbb R$. In other words it is a tree in which every vertex owns a real label. We let $\mathcal B$ be the set of branching walks, and $\mathcal B_n$ be the branching walks associated with trees from $\mathcal T_n$.

We now introduce some randomness and construct a probability distribution on $\mathcal B$ and on $\mathcal B_n$. The set of underlying trees is endowed with the distribution of the family tree of a Galton-Watson (GW) process with offspring distribution $\mu = (\mu_k)_{k\ge 0}$ starting from one individual. We denote by $T$ a random tree under this distribution. The distribution of the labels is defined as follows. Consider $(\nu_k)_{k\in\{1,2,\dots\}}$ a family of distributions, where $\nu_k$ is a distribution on $\mathbb R^k$. The labels are defined conditionally on the underlying tree $T$: set $\ell(\varnothing) = 0$, and for any $u \in T \setminus \partial T$, consider

$$X_u := \big(\ell(u1) - \ell(u), \dots, \ell(uc_u(T)) - \ell(u)\big),$$

the evolution-vector of the labels between $u$ and its children. Conditionally on $T$, we assume that the r.v. $X_u$ are independent, and that $X_u$ has distribution $\nu_{c_u(T)}$. This determines a distribution on $\mathcal B$, denoted by $\mathbb P$. For example, if $\nu_k$ is the uniform distribution on $\{-1,+1\}^k$ for any $k > 0$, then the r.v. $\ell(u1) - \ell(u), \dots, \ell(uc_u(T)) - \ell(u)$ are independent with common distribution $\frac12(\delta_{+1} + \delta_{-1})$ ($\delta_x$ stands for the Dirac mass at $x$). In the case where $\nu_k$ is the uniform distribution on $\{(1,\dots,k), (-1,\dots,-k)\}$, the r.v. $\ell(ui) - \ell(u)$ and $\ell(uj) - \ell(u)$ are not independent and do not have the same distribution.
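
The construction of the labels given the underlying tree can be sketched as follows. This is a minimal illustration, not the paper's own code: the tree is a set of tuples (root `()`), and the helper names `sample_labels` and `uniform_pm_ramp` are hypothetical. The second one draws the evolution-vector from the uniform distribution on $\{(1,\dots,k),(-1,\dots,-k)\}$ mentioned above.

```python
import random


def num_children(tree, u):
    """Number of children of u in a planar tree given as a set of tuples."""
    i = 0
    while u + (i + 1,) in tree:
        i += 1
    return i


def sample_labels(tree, sample_displacements, rng):
    """Label of the root is 0; each internal node u draws one evolution-vector X_u."""
    label = {(): 0}
    # sorted() visits every father before its children (a prefix sorts first)
    for u in sorted(tree):
        k = num_children(tree, u)
        if k == 0:
            continue
        x = sample_displacements(k, rng)   # vector of length k
        for i in range(1, k + 1):
            label[u + (i,)] = label[u] + x[i - 1]
    return label


def uniform_pm_ramp(k, rng):
    """One draw from the uniform distribution on {(1,...,k), (-1,...,-k)}."""
    sign = rng.choice((1, -1))
    return [sign * i for i in range(1, k + 1)]


T = {(), (1,), (2,), (1, 1)}
labels = sample_labels(T, uniform_pm_ramp, random.Random(0))
```

In this example the two displacements from the root are either $(1,2)$ or $(-1,-2)$, so they are neither independent nor identically distributed.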

We now define two sets of assumptions $(H_1)$ and $(H_2)$ that will be assumed to be satisfied in most of our results. $(H_1)$ is the condition that $\mu$ is non-degenerate, critical and has a bounded support:

$$(H_1) := \Big(\mu_0 + \mu_1 \ne 1,\quad \sum_{k\ge 0} k\mu_k = 1,\quad \text{there exists } K > 0 \text{ s.t. } \sum_{k\le K}\mu_k = 1\Big).$$

Under $(H_1)$ the variance $\sigma^2$ of $\mu$ is finite and non zero. The bounded support condition is quite a strong restriction, but considering non-bounded distributions leads to non-trivial complications, and we were unable to extend the most important results to that case.

Let $Y^{(k)} = (Y_{k,1},\dots,Y_{k,k})$ be $\nu_k$-distributed, and let $m_{k,j}$ and $\sigma^2_{k,j}$ be the mean and the variance of $Y_{k,j}$. We call global mean and global variance of the branching random walk,

$$m = \sum_{k\ge 1}\sum_{j=1}^{k} \mu_k m_{k,j}, \quad\text{and}\quad \beta^2 = \sum_{k\ge 1}\sum_{j=1}^{k} \mu_k\,\mathbb E\big(Y_{k,j}^2\big).$$

Let $(H_2)$ denote the conditions that the global mean is null, the global variance finite, and for a $p > 4$, the centered $p$th moments of the $Y_{k,j}$'s are finite:

$$(H_2) := \Big(m = 0 \text{ and } \beta \in (0,+\infty),\quad \text{there exists } p > 4 \text{ s.t. for any } (k,j),\ 1\le j\le k\le K,\ \mathbb E\big(|Y_{k,j} - m_{k,j}|^p\big) < +\infty\Big).$$
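
For finitely supported $\mu$ and $\nu_k$, the global mean and global variance can be computed exactly. The sketch below assumes the reading $\beta^2 = \sum_k\sum_j \mu_k\,\mathbb E(Y_{k,j}^2)$ of the global variance (that is, variance plus squared mean of each marginal), which agrees with the value $\beta^2 = \frac12(1+1) = 1$ quoted for binary trees in Section 1.3. The function name and the dictionary encoding are hypothetical.

```python
from fractions import Fraction


def global_mean_and_variance(mu, nu):
    """mu: dict k -> mu_k; nu: dict k -> list of (probability, vector of length k)."""
    m = Fraction(0)
    beta2 = Fraction(0)
    for k, mu_k in mu.items():
        for j in range(k):                       # j-th marginal of nu_k (0-based)
            mean_kj = sum(p * y[j] for p, y in nu[k])
            second_kj = sum(p * y[j] ** 2 for p, y in nu[k])
            m += mu_k * mean_kj
            beta2 += mu_k * second_kj
    return m, beta2


half = Fraction(1, 2)
mu = {0: half, 2: half}                          # critical binary offspring law

# Displacements (+1, -1) with probability one: not centered, but globally centered.
deterministic = {2: [(Fraction(1), (1, -1))]}

# Two i.i.d. uniform signs.
iid_signs = {2: [(Fraction(1, 4), y) for y in [(1, 1), (1, -1), (-1, 1), (-1, -1)]]}
```

Both models give $m = 0$ and $\beta^2 = 1$, although only the second one has centered displacements.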
Figure 1: A tree on which is indicated the depth first traversal, its height and contour process.

Encoding of branching random walks

We study the asymptotic behavior of branching random walks via their encoding by depth first traversal. The depth-first traversal of a tree $T \in \mathcal T_n$ is a function

$$F_T : [\![0,2n]\!] \to \{\text{vertices of } T\},$$

which we regard as a walk around $T$, as follows: $F_T(0) = \varnothing$, and given $F_T(i) = z$, choose if possible, and according to the LO, the smallest child $w$ of $z$ which has not already been visited, and set $F_T(i+1) = w$. If not possible, let $F_T(i+1)$ be the father of $z$.
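
The walk around the tree can be sketched in a few lines of Python. Trees are sets of tuples with root `()`; the function name `depth_first_traversal` is a hypothetical helper, and the code follows the rule stated above literally (smallest unvisited child if any, otherwise the father).

```python
def num_children(tree, u):
    """Number of children of u in a planar tree given as a set of tuples."""
    i = 0
    while u + (i + 1,) in tree:
        i += 1
    return i


def depth_first_traversal(tree):
    """Return the list [F_T(0), ..., F_T(2n)] for a planar tree with n edges."""
    n = len(tree) - 1
    walk = [()]
    visited = {()}
    for _ in range(2 * n):
        z = walk[-1]
        nxt = None
        for i in range(1, num_children(tree, z) + 1):
            if z + (i,) not in visited:      # smallest child not yet visited
                nxt = z + (i,)
                break
        if nxt is None:
            nxt = z[:-1]                     # otherwise go back to the father
        visited.add(nxt)
        walk.append(nxt)
    return walk


T = {(), (1,), (2,), (1, 1)}
walk = depth_first_traversal(T)
contour = [len(u) for u in walk]             # depths along the walk
```

Each edge is crossed twice, once in each direction, which is why the walk has $2n$ steps and ends at the root.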

We also denote by $\preceq$ the lexicographical order (LO) on the planar trees (and $u \prec v$ if $u \preceq v$ and $u \ne v$), and let $u(k)$ be the $k$-th vertex in the LO ($u(0) = \varnothing$).

We now encode the branching random walk with the help of a pair of processes. For any $k \in [\![0,|T|-1]\!]$, let $H_k = |u(k)|$ and $R_k = \ell(u(k))$. The height process $(H_s, s \in [0,|T|-1])$ and head label process $(R_s, s \in [0,|T|-1])$ are obtained from the sequences $(H_k)$ and $(R_k)$ by linear interpolation. Alternatively, one may encode the branching random walk with a pair of processes associated with the depth first traversal: for any $k \in [\![0,2(|T|-1)]\!]$, let $\hat H_k = |F_T(k)|$ and $\hat R_k = \ell(F_T(k))$. The processes $(\hat H_s, s \in [0,2(|T|-1)])$ and $(\hat R_s, s \in [0,2(|T|-1)])$, obtained by interpolation, are called respectively the contour process and the contour label process; the pair $(\hat H, \hat R)$ is called the head of the discrete snake.
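
The height and label sequences in lexicographical order are easy to compute when nodes are tuples, because Python compares tuples lexicographically and a prefix sorts before its extensions. The sketch below is only an illustration; `height_and_labels` and `interpolate` are hypothetical names and the labels are arbitrary.

```python
def height_and_labels(tree, label):
    """Sequences (H_k) and (R_k), k = 0..|T|-1, in lexicographical order."""
    order = sorted(tree)                 # u(0), u(1), ..., u(|T|-1)
    heights = [len(u) for u in order]
    labels = [label[u] for u in order]
    return heights, labels


def interpolate(seq, s):
    """Linear interpolation of a finite sequence at a real time s >= 0."""
    k = int(s)
    if k >= len(seq) - 1:
        return float(seq[-1])
    return seq[k] + (s - k) * (seq[k + 1] - seq[k])


T = {(), (1,), (2,), (1, 1)}
label = {(): 0, (1,): 1, (1, 1): 2, (2,): -1}
H, R = height_and_labels(T, label)
```

Applying `interpolate` to `H` or `R` gives the continuous height and label processes at non-integer times.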

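Let $h_n$, $\hat h_n$, $r_n$ and $\hat r_n$ be the normalized versions of $H$, $\hat H$, $R$ and $\hat R$ when $T$ is $\mathbb P_n$-distributed:

$$h_n(s) = \frac{H_{ns}}{n^{1/2}},\quad \hat h_n(s) = \frac{\hat H_{2ns}}{n^{1/2}},\quad r_n(s) = \frac{R_{ns}}{\beta n^{1/4}},\quad \hat r_n(s) = \frac{\hat R_{2ns}}{\beta n^{1/4}},\quad\text{for any } s \in [0,1].$$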

Let $d := \gcd\{k, k\ge 1, \mu_k > 0\}$. The support of the distribution of $|T|$ (we write $\mathrm{supp}(|T|)$) is included in $1 + d\mathbb N$ (and $\mathbb P(|T| = 1 + kd) > 0$ for every $k$ large enough). For $n+1 \in \mathrm{supp}(|T|)$, the distribution $\mathbb P$ under the conditioning by $|T| = n+1$ is denoted by $\mathbb P_n$; in other words $\mathbb P_n = \mathbb P(\,\cdot \mid |T| = n+1)$. Even if not recalled, each statement concerning weak convergence under $\mathbb P_n$ is assumed to be along the subsequence $(n_k)_k$ for which $\mathbb P_{n_k}$ is well defined. In the proofs, we will treat only the case $d = 1$, the general case being treated with slight modifications.
Figure 2: A branching random walk from $\mathcal B_9$. On the first column, the contour process and the contour label process; on the second column, the height process and the height label process.

Theorem 1 If $(H_1)$ and $(H_2)$ are satisfied then

$$(h_n, r_n, \hat h_n, \hat r_n) \xrightarrow[n]{(d)} (h, r, h, r)$$

in $C([0,1])^4$ endowed with the topology of uniform convergence, where $h = 2\mathsf e/\sigma$ and $\mathsf e$ is the normalized Brownian excursion, and where conditionally on $h$, $r$ is a centered Gaussian process with covariance function

$$\operatorname{cov}(r(s), r(t)) = \check h(s,t) := \min_{u\in[s\wedge t,\, s\vee t]} h(u), \quad\text{for any } s,t \in [0,1].$$

Notice that the same processes $h$ and $r$ appear twice in the limit process. The convergence of processes associated with the contour processes (with a hat) to the same limit as the one associated with the height processes is well understood now, and "almost" generic (Duquesne & Le Gall [9, Section 2.5] and [20]). In Section 2.7, we prove that we may concentrate only on the height process, as done in this paper. The process $(r,h)$ (or with a different scaling) is called in the literature head of the Brownian snake with lifetime process the normalized Brownian excursion (BSBE). We refer to the works of Le Gall (e.g. [16] and with Duquesne [10]) for information on the Brownian snake.

In this paper, we deal only with the head of the snake, and not precisely in terms of snakes, even if, thanks to the homeomorphism theorem [20] evoked below, Theorem 1 has some applications in terms of snakes. We refer to [20, 13] for the notion of discrete snake, which is the discrete analogue of the BSBE: the discrete snake associated with the branching random walk $(T,\ell)$ is the pair $(\hat H, \Phi)$ where $\Phi = (\Phi_k)_{k\in[\![0,2(|T|-1)]\!]}$ and $\Phi_k$ is the sequence of labels on the branch $[\![\varnothing, F_T(k)]\!]$. The title of the present paper is then taken from our model of snake under $(H_2)$, in which the global mean is 0.

Related works

The convergence $h_n \xrightarrow[n]{(d)} h$ is due to Aldous [1, 2] (see also Marckert & Mokkadem [21] for a revisited proof, Pitman [24, Chap. 5 and 6], and Duquesne [9] and Duquesne & Le Gall [10, Section 2.5] for generalization to GW trees with offspring distribution having infinite variance).

The first two results concerning the convergence of discrete snakes to the BSBE appeared in two independent works:

• Chassaing & Schaeffer [7] deal with discrete snakes built on underlying trees chosen uniformly in $\mathcal T_n$ (this corresponds to the case where $\mu \sim \mathrm{Geom}(1/2)$) and where the displacements are i.i.d., and for any $k,j$, $\nu_{k,j}$ is the uniform distribution on $\{-1,0,+1\}$. They show the convergence of the head of the snake for the Skorohod topology, and the convergence of the moments of the maximum of $r_n$ is also given. This study was motivated by the deep relation between this model of discrete snake and random rooted quadrangulations, underlined by the authors.

• Marckert & Mokkadem [20] also studied the case $\mu \sim \mathrm{Geom}(1/2)$ but with more general centered displacements that have moments of order $6+\varepsilon$ (the distribution $\nu_{k,j}$ does not depend on $k,j$, but $\nu_k$ is not assumed to be $\nu_{k,1}\times\dots\times\nu_{k,k}$). The convergence of the head of the snake holds in $C([0,1],\mathbb R^2)$ and the convergence of the snake itself is given thanks to a "homeomorphism theorem" which implies that the convergence of the snake and of its tour (in the space of continuous functions) are equivalent. Here it implies that under the hypotheses of Theorem 1, the discrete snake associated with our model of labeled trees converges weakly to the BSBE (see [20] for more details).

Then some generalizations appeared a few months later:

• Gittenberger [11] provides a generalization of a lemma from [20] and considers snakes whose underlying trees are GW trees conditioned by the size (condition equivalent to $(H_1)$). The displacements must be centered and have moments of order $8+\varepsilon$.

• Janson & Marckert [13] show that in the i.i.d. case (the $\nu_{k,j}$ do not depend on $(k,j)$), moments of order $4+\varepsilon$ are necessary to get the convergence to the BSBE. If no such moment exists, the convergence to a "hairy snake" is proved under the Hausdorff topology.

• In Marckert & Miermont [19], the case of $\nu_{k,j}$ depending on $k,j$ is investigated (also the underlying GW trees are allowed to have two types). The hypotheses are: for each $k,j$, $m_{k,j} = 0$, condition $(H_2)$ is satisfied, and then $\sum \mu_k \sigma^2_{k,j} < +\infty$. A motivation was to generalize the works of Chassaing & Schaeffer [7] concerning quadrangulations to bipartite maps.

Another important point is the convergence of the occupation measure of the head of the discrete snake to the one of the BSBE, the random measure named ISE (the integrated superBrownian excursion introduced by Aldous [3], see also Le Gall [16] and [20, 13]). Using the convergence of discrete snakes to the BSBE, Bousquet-Mélou [4] and Bousquet-Mélou & Janson [5] deduce new results on ISE and on the BSBE; for example, some properties of the support of ISE, and of the random density of ISE, are derived. We refer also to Le Gall [15] for the convergence of discrete snakes conditioned to stay positive.

The novelty in the present paper is that the condition $\{m_{k,j} = 0, \forall k,j\}$ is replaced by $m = \sum_{k\ge 1}\sum_{j=1}^{k}\mu_k m_{k,j} = 0$. This allows us to consider some natural models where, for example, the displacements are not random knowing the underlying tree (see Section 1.3). The proof of Theorem 1 relies in part on some results from [19], and on a new approach, necessary to control the contribution of the mean of the displacements; the main point for this is the comparison of the lineage of each node with some multinomial r.v.: this is the aim of Theorem 2, which we think interesting in itself, since it reveals a fine global behavior of GW trees conditioned by the size. Unfortunately, the price of this generalization is to consider only offspring distributions with bounded support. The reason comes from the proof of Theorem 2. We guess that some generalization for all families of GW trees (with finite variance) may be found, but for this, a control of an infinite sequence of processes arising in Theorem 2 should be provided, which we were unable to do.

1.2 On the lineage of nodes 

Assume that $(H_1)$ and $(H_2)$ hold. Let $K$ be a bound of the offspring distribution. For $u = i_1\dots i_h \in T$, let $u_j = i_1\dots i_j$ and $[\![\varnothing,u]\!] = \{\varnothing = u_0, u_1, \dots, u_{|u|}\}$ be the ancestral line of $u$ back to the root. Conditionally on $T$, $\ell(u)$ owns the following representation:

$$\ell(u) = \sum_{m=1}^{|u|}\big(\ell(u_m) - \ell(u_{m-1})\big) \qquad (1)$$

where $\ell(u_m) - \ell(u_{m-1})$ is $\nu_{k,j}$-distributed when $c_{u_{m-1}}(T) = k$ and $i_m = j$, where $\nu_{k,j}$ is the $j$th marginal of $\nu_k$, and where the r.v. $(\ell(u_m) - \ell(u_{m-1}))$'s are independent; the variables $\ell(u_m) - \ell(u_{m-1})$ will often be called displacements.

Consider the array $I_K = \{(k,j), 1\le j\le k\le K\}$. Let $u$ be a node of $T$. For any $(k,j) \in I_K$, let $A_{u,k,j}(T)$ be the number of strict ancestors $v$ of $u$ (the nodes $v \in [\![\varnothing,u[\![$) such that $c_v(T) = k$, and such that $u$ is a descendant of $vj$, the $j$th child of $v$ (we write $f_v(u) = j$). We say that $v$ is an ancestor of type $(k,j)$ of $u$, and we call the vector $A_u = (A_{u,k,j})_{(k,j)\in I_K}$ the lineage of $u$ (or the content of $[\![\varnothing,u]\!]$). See Figure 3.




Figure 3: On this tree $A_{u,1,1} = 1$, $A_{u,2,2} = 1$, $A_{u,4,2} = 1$, $A_{u,5,3} = 1$; the other $A_{u,k,j}$ are 0.
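
The lineage of a node can be read directly from its word. The following sketch is an illustration with a hypothetical helper `lineage`: it walks along the strict ancestors $v$ of $u$ and records, for each of them, the pair (number of children of $v$, rank of the child of $v$ through which $u$ descends).

```python
from collections import Counter


def num_children(tree, u):
    """Number of children of u in a planar tree given as a set of tuples."""
    i = 0
    while u + (i + 1,) in tree:
        i += 1
    return i


def lineage(tree, u):
    """Counter mapping (k, j) to the number of ancestors of type (k, j) of u."""
    counts = Counter()
    for depth in range(len(u)):
        v = u[:depth]                    # strict ancestor of u at this depth
        k = num_children(tree, v)        # v has k children
        j = u[depth]                     # u is a descendant of the j-th child of v
        counts[(k, j)] += 1
    return counts


T = {(), (1,), (2,), (3,), (1, 1), (2, 1), (2, 2), (2, 2, 1)}
A = lineage(T, (2, 2, 1))
```

The entries of the lineage always sum to the depth of the node, since each strict ancestor has exactly one type.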



By (1), conditionally on $T$, the label $\ell(u)$ owns the following representation:

$$\ell(u) \overset{(d)}{=} \sum_{(k,j)\in I_K}\ \sum_{l=1}^{A_{u,k,j}} Y^{(l)}_{k,j}$$

where the r.v. $Y^{(l)}_{k,j}$ are independent, and where for any $l$, $Y^{(l)}_{k,j}$ is $\nu_{k,j}$-distributed. In order to make more apparent the contribution of the $m_{k,j}$'s, and using that $m = 0$, write

$$\ell(u) \overset{(d)}{=} \sum_{(k,j)\in I_K}\ \sum_{l=1}^{A_{u,k,j}} \big(Y^{(l)}_{k,j} - m_{k,j}\big) + \sum_{(k,j)\in I_K} \big(A_{u,k,j} - \mu_k|u|\big)\, m_{k,j}. \qquad (2)$$

Assume that $T$ is $\mathbb P_n$-distributed, and that $u = u(ns)$ for some $s \in (0,1)$. Conditionally on $|u|$, we will see that both parts of the right hand side of (2) divided by $n^{1/4}$ converge in distribution, and the limit r.v. are independent: in the first part, the fluctuations of $A_{u,k,j}$ around $\mu_k|u|$ are not important, while in the second sum only the fluctuations of $A_{u,k,j}$ around $\mu_k|u|$ matter.

We now concentrate on the r.v. $A_u$'s under $\mathbb P_n$. For any $l \in [\![0,n]\!]$, $(k,j) \in I_K$, set

$$g^{(n)}_{k,j}(l) := A_{u(l),k,j} - \mu_k|u(l)|.$$

For every $(k,j) \in I_K$, the process $l \mapsto g^{(n)}_{k,j}(l)$ encodes the evolution of the number of ancestors of type $(k,j)$ of $u(l)$, when $l$ varies. Consider $G^{(n)} = (G^{(n)}(s))_{s\in[0,1]}$ the process taking its values in $\mathbb R^{\# I_K}$ defined by: for any $s$, $G^{(n)}(s) = \big(G^{(n)}_{k,j}(s)\big)_{(k,j)\in I_K}$ where $s \mapsto G^{(n)}_{k,j}(s)$ is the real continuous process that interpolates $g^{(n)}_{k,j}$ as follows:

$$G^{(n)}_{k,j}(s) = \frac{g^{(n)}_{k,j}(\lfloor ns\rfloor) + \{ns\}\big(g^{(n)}_{k,j}(\lfloor ns\rfloor + 1) - g^{(n)}_{k,j}(\lfloor ns\rfloor)\big)}{n^{1/4}}. \qquad (3)$$

The random process $G^{(n)}$ encodes the lineage of all the nodes of $T$, and its limiting behavior is described by the following theorem.

Theorem 2 Under $(H_1)$, $(H_2)$ the following convergence in distribution holds in $C([0,1])^{\# I_K} \times C[0,1]$ endowed with the topology of the uniform convergence:

$$(G^{(n)}, h_n) \xrightarrow[n]{(d)} (G, h)$$

where $h$ is defined as in Theorem 1 and $G = (G_{k,j}(s))_{(k,j)\in I_K,\, s\in[0,1]}$ is a real centered Gaussian field with the following covariance function: for any $(k,j)$ and $(k',j')$ in $I_K$, $s$ and $s'$ in $[0,1]$,

$$\operatorname{cov}\big(G_{k,j}(s), G_{k',j'}(s')\big) = \big(-\mu_k\mu_{k'} + \mu_k 1_{(k,j)=(k',j')}\big)\,\check h(s,s'). \qquad (4)$$

1.3 Comments, examples and applications 

1) Theorem 2 may be considered as the strongest result of this paper. It gives very precise information on the asymptotic behavior of the process $G^{(n)}$ that encodes the lineage of all the nodes. This gives a "global asymptotic" property reminiscent of the properties of the distinguished branch in "a size biased GW tree" (see [17, chap. 11]). The restriction to offspring distributions having a bounded support comes from the proof of this result.
2) For any fixed $(k,j) \in I_K$, knowing $h$, $G_{k,j}$ is a Gaussian process with covariance function

$$\operatorname{cov}\big(G_{k,j}(s), G_{k,j}(s')\big) = (-\mu_k^2 + \mu_k)\,\check h(s,s').$$

In other words, the process $(G_{k,j}, h)$ has the same distribution as $\big(\sqrt{-\mu_k^2 + \mu_k}\; r, h\big)$, and then up to some multiplicative constants, $(G_{k,j}, h)$ is the head of a BSBE. As a simple consequence of Theorem 2, we have that $(G_{k,j}, h)_{(k,j)\in I_K}$ is a sequence of heads of BSBE, and that for any $(k,j) \in I_K$,

$$(G^{(n)}_{k,j}, h_n) \xrightarrow[n]{(d)} (G_{k,j}, h). \qquad (5)$$

The dependence between the different processes $G_{k,j}$ is governed by (4). For any family of real numbers $(\lambda_{k,j})_{(k,j)\in I_K}$, we have

$$\Big(\sum_{(k,j)\in I_K} \lambda_{k,j} G^{(n)}_{k,j},\ h_n\Big) \xrightarrow[n]{(d)} \Big(\sum_{(k,j)\in I_K} \lambda_{k,j} G_{k,j},\ h\Big). \qquad (6)$$

We would like to stress the following point: discrete snakes are usually constructed with "two levels of randomness": the underlying trees are random and so are the displacements given the underlying tree, and then the BSBE appears to be a natural limit of these objects. Here, we provide some objects with only "one level of randomness" that converge to the Brownian snake. The BSBE appears as a kind of internal complexity measure in trees, measuring the difference between the number of ancestors of type $(k,j)$ and some expected quantities.

3) Consider the case $\mu = \frac12(\delta_0 + \delta_2)$, $\nu_2 = \delta_{(+1,-1)}$, of binary trees in which the displacements are not random: $\ell(u1) - \ell(u) = +1$ and $\ell(u2) - \ell(u) = -1$. We have $m = 0$ and $\beta^2 = \frac12(1+1) = 1$, and Theorems 1 and 2 apply. Hence, the clear positive bias for $r_n(t)$ for small values of $t$ disappears at the limit. Note also that this normalizing factor is exactly the same as if $\nu_2 = \frac12\big(\delta_{(+1,-1)} + \delta_{(-1,+1)}\big)$ (case where $(\ell(u1) - \ell(u), \ell(u2) - \ell(u))$ is equally likely $(+1,-1)$ or $(-1,+1)$) and as if $\nu_2 = \big(\frac12(\delta_{+1} + \delta_{-1})\big)^{\otimes 2}$ (case where $\ell(u1) - \ell(u)$ and $\ell(u2) - \ell(u)$ are i.i.d., uniform on $\{-1,1\}$). The question of the convergence of the discrete snake in the case $\nu_2 = \delta_{(+1,-1)}$ appears first in Marckert [18] in relation with some properties of the rotation correspondence, and the difference between left and right depth in binary trees. The convergence of $(r_n)$ is not given in [18], but the convergence of the occupation measure of $r_n$, "the discrete ISE", to ISE is established. We refer also to Janson [12] for recent developments concerning the same question.

Further, notice that in this model, the label $\ell(u)$ of a vertex $u$ is $\ell(u) = A_{u,2,1} - A_{u,2,2}$, that is the number of left steps minus the number of right steps necessary to climb from the root to $u$ in the binary tree. The convergence of $(r_n)$ can be seen directly via the one of $(G^{(n)})$:

$$\big(G^{(n)}_{2,1}, G^{(n)}_{2,2}, h_n\big) \xrightarrow[n]{(d)} (G_{2,1}, G_{2,2}, h), \qquad (7)$$

and then $r_n = G^{(n)}_{2,1} - G^{(n)}_{2,2} \xrightarrow[n]{(d)} G_{2,1} - G_{2,2}$ which is, conditionally on $h$ and according to (4), a centered Gaussian process with covariance function $\check h(s,t)$. Here, the convergence of $(r_n)$ appears to be a consequence of the convergence of $G^{(n)}_{2,1}$ and $G^{(n)}_{2,2}$, encoding the right depth and the left depth in binary trees.

2 Proofs 

The proofs rely on a precise study of the lineage of the nodes under $\mathbb P_n$, and in particular on the comparison of $A_u$ with a multinomial random variable. For this reason we first give some elements on multinomial distributions and on their asymptotic behaviors. We then proceed to the proof of Theorem 2, showing first the convergence of the uni-dimensional distributions, then the convergence of the finite-dimensional distributions. The proof of Theorem 1 is given afterward. We think that some points of view, especially in the description of the distribution of the lineages in trees under $\mathbb P_n$, should provide some new approaches to study the trees under $\mathbb P_n$.

2.1 Prerequisite on multinomial distributions 

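The content of this section is quite classical. Consider $p = (p_i)_{i\in I_K}$ the distribution on $I_K$ defined by

$$p_{k,j} := \mu_k \quad\text{for any } (k,j) \in I_K.$$

We say that $M^{(h)}$ is a multinomial r.v. with parameters $h$ and $p$ if, for any $m = (m_i)_{i\in I_K}$,

$$Q_h(\{m\}) := \mathbb P(M^{(h)} = m) = \binom{h}{m}\prod_{i\in I_K} p_i^{m_i}\, 1_{m\in\mathbb N^{I_K}[h]}$$

where $\binom{h}{m} = h!/\prod_{i\in I_K} m_i!$ and where for any $n \ge 1$, $\mathbb N^{I_K}[n]$ is the set of elements $c = (c_i)_{i\in I_K}$ of $\mathbb N^{\# I_K}$ such that $\sum_{i\in I_K} c_i = n$.

Recall that for any $i \in I_K$, $M^{(n)}_i$ is a binomial r.v. with parameters $n$ and $p_i$.

In order to fit with further considerations, we introduce the $\# I_K$-dimensional real vector $\mathcal G(n,h) = (\mathcal G_i(n,h))_{i\in I_K}$ defined by

$$\mathcal G_{k,j}(n,h) = n^{-1/4}\big(M^{(h)}_{k,j} - \mu_k h\big) \quad\text{for any } (k,j) \in I_K.$$

Let $\mathcal G_\infty = (\mathcal G_{\infty,i})_{i\in I_K}$ be a centered Gaussian vector having as covariance function

$$\operatorname{cov}(\mathcal G_{\infty,i}, \mathcal G_{\infty,i'}) = -p_i p_{i'} + p_i 1_{i=i'} \quad\text{for any } i,i' \in I_K. \qquad (8)$$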
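
The covariance structure (8) is the covariance of a multinomial vector divided by $h$. The sketch below checks this by exact enumeration for a small $h$, using rational arithmetic. It assumes $K = 2$ and the critical law $\mu_0 = \frac14$, $\mu_1 = \frac12$, $\mu_2 = \frac14$ chosen only for this example, so that $p = (\mu_1, \mu_2, \mu_2)$ on $I_2 = \{(1,1),(2,1),(2,2)\}$; all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def multinomial_pmf(m, p):
    """Probability that a multinomial vector with parameters sum(m) and p equals m."""
    coef = factorial(sum(m))
    for mi in m:
        coef //= factorial(mi)           # exact at every step
    prob = Fraction(coef)
    for mi, pi in zip(m, p):
        prob *= pi ** mi
    return prob


def compositions(h, parts):
    """All tuples of `parts` non-negative integers summing to h."""
    if parts == 1:
        yield (h,)
        return
    for first in range(h + 1):
        for rest in compositions(h - first, parts - 1):
            yield (first,) + rest


def covariance(h, p, i, j):
    """Exact covariance of coordinates i and j of the multinomial vector."""
    e_i = e_j = e_ij = Fraction(0)
    for m in compositions(h, len(p)):
        q = multinomial_pmf(m, p)
        e_i += q * m[i]
        e_j += q * m[j]
        e_ij += q * m[i] * m[j]
    return e_ij - e_i * e_j


p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]   # (mu_1, mu_2, mu_2)
h = 3
```

Here `covariance(h, p, i, j)` equals `h * (-p[i] * p[j] + (p[i] if i == j else 0))`, which is $h$ times the right hand side of (8).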

Proposition 3 Let $(h(n))$ be a sequence of positive integers s.t. $h(n)/\sqrt n \to \lambda \in (0,+\infty)$. Under $(H_1)$ we have $\mathcal G(n,h(n)) \xrightarrow[n]{(d)} \sqrt\lambda\, \mathcal G_\infty$ in $\mathbb R^{\# I_K}$.

Proof. This may be proved using classical tools. As pointed out by E. Rio in a personal discussion, this is also a consequence of the convergence of the empirical process to the Brownian bridge. We only sketch the proof (for $\lambda = 1$): let $(U_i)_i$ be a sequence of i.i.d. r.v. uniform on $[0,1]$. Let $F_n$ be the associated empirical distribution function and $F$ the distribution function of $U$. Denote by $g_n = F_n - F$. According to Donsker [8], $\sqrt n\, g_n \xrightarrow[n]{(d)} b$ where $b$ is a normalized Brownian bridge.

Take $q = (q_i)_{i\ge 1}$ a distribution on $\mathbb N$ and consider $N^{(n)}_k = \#\{j, 1\le j\le n, U_j \in [q_1 + \dots + q_{k-1}, q_1 + \dots + q_k)\}$. Then $(N^{(n)}_k)_{k\ge 1}$ is a multinomial r.v. with parameters $n$ and $q$ and satisfies

$$(N^{(n)}_k - q_k n)/\sqrt n = \sqrt n\,\big(g_n(q_1 + \dots + q_k) - g_n(q_1 + \dots + q_{k-1})\big).$$

By Donsker, for any $L > 0$, $\big((N^{(n)}_k - q_k n)/\sqrt n\big)_{k\le L}$ converges in distribution to $\big(b_{q_1+\dots+q_k} - b_{q_1+\dots+q_{k-1}}\big)_{k\le L}$. The properties of $b$ allow to conclude. □

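The following Proposition will be used in the proof of the tightness of $(G^{(n)})$.

Proposition 4 Under $(H_1)$, for any $\beta \ge 1$, there exists $c > 0$ such that, for any $h > 0$, any $n > 0$,

$$\mathbb E\big(\|\mathcal G(n,h)\|_1^\beta\big) \le c\,(h/\sqrt n)^{\beta/2}.$$

Recall that all the norms are equivalent in $\mathbb R^{\# I_K}$. Here, we use $\|U\|_1 = \sum_{(k,j)\in I_K} |U_{k,j}|$.

Proof. First, since $\|U\|_1^\beta \le c\sum |U_{k,j}|^\beta$ for some $c > 0$, $\mathbb E(\|\mathcal G(n,h)\|_1^\beta) \le c\sum_{(k,j)\in I_K} \mathbb E\big(|n^{-1/4}(M^{(h)}_{k,j} - \mu_k h)|^\beta\big)$. Since $M^{(h)}_{k,j}$ is a binomial random variable with parameters $\mu_k$ and $h$, $\mathbb E\big(|M^{(h)}_{k,j} - \mu_k h|^\beta\big) \le C(\mu_k,\beta)\, h^{\beta/2}$ where the constant $C(\mu_k,\beta)$ depends on $\mu_k$ and $\beta$ (see Petrov [23], th. 2.10 p. 62). □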

2.2 Decomposition of trees using the lineages 

For any $k \in \mathbb N$, a forest with $k$ roots is a $k$-tuple of planar trees $f = (t^1,\dots,t^k)$. The size $|f|$ of $f$ is $|t^1| + \dots + |t^k|$. We denote by $f_k = (T^1,\dots,T^k)$ a random forest in which the trees $T^1,\dots,T^k$ are i.i.d. GW trees with offspring distribution $\mu$. For any $a = (a_{k,j})_{(k,j)\in I_K}$ we write

$$N_1(a) = \sum_{(k,j)\in I_K} (j-1)\,a_{k,j} \quad\text{and}\quad N_2(a) = \sum_{(k,j)\in I_K} (k-j)\,a_{k,j}.$$

Proposition 5 Let $h$ be a non-negative integer. For any $a \in \mathbb N^{I_K}[h]$, and any $m \in [\![0,n]\!]$,

$$\mathbb P_n\big(A_{u(m)} = a\big) = Q_h(a)\,\frac{\mathbb P\big(|f_{N_1(a)}| = m-h,\ |f'_{1+N_2(a)}| = n+1-m\big)}{\mathbb P(|T| = n+1)} \qquad (9)$$

where $f$ and $f'$ are two independent forests.

Proof. To build a tree $T$ of $\mathcal T_n$ such that $A_{u(m)} = a$, we first build the branch $b = [\![\varnothing,u(m)]\!]$: exactly $a_{k,j}$ ancestors $v$ among the $h$ strict ancestors of $u(m)$ satisfy $(c_v(T), f_v(u(m))) = (k,j)$. Hence, there are $\binom{h}{a}$ ways to build $b$. Then, we complete $b$ by grafting on its neighbors some subtrees satisfying the following constraints. When $A_{u(m)} = a$, the numbers of subtrees rooted on the neighbors of the branch $[\![\varnothing,u(m)]\!]$ visited before $u(m)$ (resp. after $u(m)$) are respectively

$$N_1(a) = \#\{w,\ d([\![\varnothing,u(m)]\!],w) = 1,\ w \prec u(m)\},$$
$$1 + N_2(a) = \#\big(\{u(m)\} \cup \{w,\ d(]\!]u(m),\varnothing]\!],w) = 1,\ u(m) \prec w\}\big).$$

Figure 4: The two forests considered in the decomposition

See an illustration on Figure 4. The $N_1(a)$ subtrees must contain exactly $m - |u(m)|$ nodes (the nodes, among the $m+1$ first, not on $[\![\varnothing,u(m)]\!]$), and the $1 + N_2(a)$ subtrees must contain exactly $n+1-m$ nodes (the nodes visited after $u(m)$, $u(m)$ included). In other words, we need two forests containing respectively $m-h$ and $n+1-m$ nodes. Hence, using simple considerations on the probability distribution of GW trees we get the announced result. □
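
The counting argument of this proof can be checked on a concrete tree. The sketch below (hypothetical helper `decomposition`, trees as sets of tuples) lists the roots of the subtrees hanging to the left and to the right of the branch from the root to $u$, assuming the reading $N_1(a) = \sum (j-1)a_{k,j}$ and $N_2(a) = \sum (k-j)a_{k,j}$, and returns the sizes of the two forests.

```python
def num_children(tree, u):
    """Number of children of u in a planar tree given as a set of tuples."""
    i = 0
    while u + (i + 1,) in tree:
        i += 1
    return i


def subtree_size(tree, root):
    """Number of nodes of the subtree rooted at `root`."""
    return sum(1 for w in tree if w[:len(root)] == root)


def decomposition(tree, u):
    """Return (m, h, N1, N2, size of left forest, size of right forest)."""
    m = sorted(tree).index(u)            # u = u(m) in lexicographical order
    h = len(u)
    left_roots, right_roots = [], [u]    # u itself roots one tree on the right
    for depth in range(h):
        v, j = u[:depth], u[depth]
        k = num_children(tree, v)
        left_roots += [v + (i,) for i in range(1, j)]           # j - 1 brothers
        right_roots += [v + (i,) for i in range(j + 1, k + 1)]  # k - j brothers
    left_size = sum(subtree_size(tree, r) for r in left_roots)
    right_size = sum(subtree_size(tree, r) for r in right_roots)
    return m, h, len(left_roots), len(right_roots) - 1, left_size, right_size


T = {(), (1,), (2,), (3,), (1, 1), (2, 1), (2, 2), (2, 2, 1)}
m, h, n1, n2, left_size, right_size = decomposition(T, (2, 2))
```

On this example the left forest has $m - h$ nodes and the right forest has $n + 1 - m$ nodes, where $n + 1$ is `len(T)`.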

A consequence of this Proposition is

$$\mathbb P_n\big(|u(m)| = h\big) = \frac{\mathbb P\big(|f_{N_1(M^{(h)})}| = m-h,\ |f'_{1+N_2(M^{(h)})}| = n+1-m\big)}{\mathbb P(|T| = n+1)} \qquad (10)$$

where $M^{(h)}$ is a multinomial random variable with parameters $h$ and $p$.

2.2.1 A few facts concerning random forests and random trees

Let $(W_i)_{i\ge 0}$ be a random walk starting from 0 with i.i.d. increments with distribution $(\tilde\mu_k)_{k\ge -1} = (\mu_{k+1})_{k\ge -1}$ (that is with increment $\xi - 1$, where $\xi$ is $\mu$-distributed). We have

Lemma 6 Assume $(H_1)$.

(i) (Otter [22]) For any $k \ge 1$ and $n \ge k$, $\mathbb P(|f_k| = n) = \frac{k}{n}\,\mathbb P(W_n = -k)$.

(ii) (Central local limit theorem (CLLT))

$$\sup_{x\in -n+d\mathbb N}\left|\frac{\sqrt n}{d}\,\mathbb P(W_n = x) - \frac{1}{\sigma\sqrt{2\pi}}\exp\Big(-\frac{x^2}{2\sigma^2 n}\Big)\right| \xrightarrow[n]{} 0. \qquad (11)$$

(iii) $\sup_{n>0}\sup_{x>0} x\,\mathbb P(W_n = x) < +\infty$. $\qquad (12)$

(i) is often called "conjugation of tree principle" or "cyclical lemma", and may be found in Pitman [24, chap. 5.1]; it is usually attributed to Otter, Kemperman or Dvoretzky-Motzkin.

(ii) is usually called the central local limit theorem (see Breuillard [6] for a state of the art). Recall that $d$ is the span of $\mu$. The support of $W_n$ is included in $-n + d\mathbb N = \{u \in \mathbb Z, u = -n + di, i \in \mathbb N\}$. A consequence of (i) and (ii) is that

$$\mathbb P(|T| = n) \sim \frac{d}{\sigma\sqrt{2\pi}}\, n^{-3/2}, \qquad (13)$$

the equivalent being taken along the subsequence where the left hand side is non-null.
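
The identity of Lemma 6 (i) can be verified exactly for small sizes. The sketch below is an illustration under the assumption $\mu = \frac12(\delta_0 + \delta_2)$; the function names are hypothetical. It computes the law of the size of a forest of $k$ i.i.d. GW trees by recursion on the size, and the law of $W_n$ by dynamic programming, both with rational arithmetic.

```python
from fractions import Fraction


def convolve(a, b, nmax):
    """Law of the sum of two independent sizes, truncated at nmax."""
    c = [Fraction(0)] * (nmax + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > nmax:
                break
            c[i + j] += ai * bj
    return c


def tree_size_law(mu, nmax):
    """tree[n] = P(|T| = n) for n = 0..nmax, built by increasing n."""
    tree = [Fraction(0)] * (nmax + 1)
    for n in range(1, nmax + 1):
        total = Fraction(0)
        for c, mu_c in mu.items():
            # the root has c children; their subtrees form a forest of size n - 1
            forest = [Fraction(1)] + [Fraction(0)] * nmax
            for _ in range(c):
                forest = convolve(forest, tree, nmax)
            total += mu_c * forest[n - 1]
        tree[n] = total
    return tree


def forest_size_law(mu, k, nmax):
    """forest[n] = P(|f_k| = n) for a forest of k i.i.d. GW trees."""
    tree = tree_size_law(mu, nmax)
    forest = [Fraction(1)] + [Fraction(0)] * nmax
    for _ in range(k):
        forest = convolve(forest, tree, nmax)
    return forest


def walk_law(mu, n):
    """Dictionary x -> P(W_n = x) for the walk with increments xi - 1."""
    law = {0: Fraction(1)}
    for _ in range(n):
        new = {}
        for x, px in law.items():
            for c, mu_c in mu.items():
                new[x + c - 1] = new.get(x + c - 1, Fraction(0)) + px * mu_c
        law = new
    return law


mu = {0: Fraction(1, 2), 2: Fraction(1, 2)}
```

For instance `tree_size_law(mu, 3)[3]` is $\frac18$, which is also $\frac13\,\mathbb P(W_3 = -1) = \frac13\cdot\frac38$.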

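Proof of (iii): $\sup_{n>0}\sup_{x\ge c\sqrt n}\{x\,\mathbb P(W_n = x)\}$ is bounded by the Tchebichev inequality. By (ii), $\sup_{x\le c\sqrt n}\sqrt n\,\mathbb P(W_n = x)$ is bounded, and then $\sup_{n>0}\sup_{x\le c\sqrt n} x\,\mathbb P(W_n = x)$ is finite. □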

The following Lemma controls the maximum increment in the process $H$ under $\mathbb P_n$.

Lemma 7 Assume $(H_1)$. For any $c > 0$ there exists $\rho > 0$ such that

$$\mathbb P_n\Big(\max_l\big\{\big||u(l+1)| - |u(l)|\big|\big\} \ge \rho\log n\Big) = O(n^{-c}).$$

Proof. We just sketch the proof, which deeply relies on the conjugation of tree principle. Take $n+1$ i.i.d. r.v. $X_1,\dots,X_{n+1}$, $\mu$-distributed. Conditionally on $\sum_{i=1}^{n+1}(X_i - 1) = -1$, among the $n+1$ shifted sequences $(X_1,\dots,X_{n+1})$, $(X_2,\dots,X_{n+1},X_1)$, ..., $(X_{n+1},X_1,\dots,X_n)$, exactly one, $(X^\star_1,\dots,X^\star_{n+1})$, corresponds to a sequence $(c_u, u\in T)$ for a tree $T \in \mathcal T_n$ (where the $c_u$ are sorted according to the depth first order), and $(X^\star_1,\dots,X^\star_{n+1}) \overset{(d)}{=} (c_u, u\in T)$ for $T$ under $\mathbb P_n$.

The inequality $\big||u(l)| - |u(l+1)|\big| = h > 1$ implies that $|u(l+1)| < |u(l)|$, and the deepest common ancestor $v$ of $u(l+1)$ and $u(l)$ has depth $|u(l+1)| - 1$. Assume that the tree is visited counterclockwise. The nodes in $]\!]v,u(l)[\![$ are visited consecutively, and each of them has at least one child. Under $\mathbb P$, when traversing the tree clockwise (or by symmetry counterclockwise) the gap between two nodes having zero child is a geometrical r.v. $\mathrm{Geom}(\mu_0)$ (we work from now on with the usual LO). Denote by $X_1,\dots,X_{n+1}$ i.i.d. random variables $\mu$-distributed and by $G_1, G_2, \dots$ the successive gaps between the zeros. Then

$$\mathbb P\Big(\max_i G_i \ge \rho\log n \,\Big|\, \sum_{i=1}^{n+1}(X_i - 1) = -1\Big) = O\Big(n^{1/2}\,\mathbb P\big(\max_{i\le n} G_i \ge \rho\log n\big)\Big) = o(n^{-c}),$$

for $\rho$ large enough. Note that the first maximum is taken on a random number of terms, a.s. bounded by $n$. By the conjugation of tree principle, we get the result. □

Remark 1 Using the same argument, one may control the depth of the last node $u(n)$: for any $c > 0$ there exists $\rho > 0$ such that

$$\mathbb P_n\big(|u(n)| \ge \rho\log n\big) = O(n^{-c}). \qquad (14)$$

For $u \in T$, $l \in [\![0,|u|]\!]$ and $(k,j) \in I_K$, let $A_{u,l,k,j}$ be the number of ancestors $v \in [\![\varnothing,u[\![$ such that $d(u,v) \le l$, and for which $c_v(T) = k$ and $f_v(u) = j$.
Lemma 8 (i) For every $c > 0$, there exists $\gamma > 0$ such that, for $n$ large enough,

$$\mathbb P_n\Big(\exists (k,j)\in I_K,\ u\in T,\ \big|A_{u,k,j} - \mu_k|u|\big| \ge \gamma\sqrt{|u|\log n}\Big) \le n^{-c}.$$

(ii) For every $c > 0$, there exists $\gamma > 0$ such that, for $n$ large enough,

$$\mathbb P_n\Big(\exists (k,j)\in I_K,\ u\in T,\ l\in[\![0,|u|]\!],\ \big|A_{u,l,k,j} - \mu_k l\big| \ge \gamma\sqrt{l\log n}\Big) \le n^{-c}.$$

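Proof. (ii) clearly implies (i). But let us prove (i) first. Using (9) and (13), we have for some constant $c > 0$, for any $m \in [\![0,n]\!]$, any $h \ge 1$, any $a \in \mathbb N^{I_K}[h]$,

$$\mathbb P_n\big(A_{u(m)} = a\big) \le c\,n^{3/2}\,Q_h(a)\,1_{h\le m}. \qquad (15)$$

Then $\mathbb P_n\big(\exists m\in[\![0,n]\!], (k,j)\in I_K, \big|A_{u(m),k,j} - \mu_k|u(m)|\big| \ge \gamma\sqrt{|u(m)|\log n}\big)$ is bounded by

$$c\,n^{3/2}\sum_{m=0}^{n}\sum_{h=0}^{n}\mathbb P\Big(\exists (k,j)\in I_K,\ \big|M^{(h)}_{k,j} - \mu_k h\big| \ge \gamma\sqrt{h\log n}\Big).$$

This latter probability is smaller, for any $m \le n$, $h \le n$, than $2\,\# I_K\, n^{-2\gamma^2}$ by Hoeffding. Hence $\mathbb P_n\big(\exists (k,j)\in I_K, u\in T, \big|A_{u,k,j} - \mu_k|u|\big| \ge \gamma\sqrt{|u|\log n}\big) \le c\,n^{7/2}\,n^{-2\gamma^2}$.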

For (ii), assume that $|u(m)| = h$ and for $l \le h$, take $v_1,\dots,v_l$ the ancestors of $u(m)$ at depths $0 \le h_1 < \dots < h_l < h$, and set $A'_{u(m),l,k,j} = \#\{i, c_{v_i} = k, f_{v_i}(u(m)) = j\}$, the lineage of $u(m)$ restricted to the nodes $v_i$'s. By "symmetry", $(A'_{u(m),l,k,j})_{k,j}$ and $(A_{u(m),l,k,j})_{k,j}$ have the same distributions. Here "symmetry" means the following: let $v_1$ and $v_2$ be two ancestors of $u(m)$. Exchange in $T$ the two nodes $v_1$ and $v_2$ together with the subtrees rooted on their children not on $[\![\varnothing,u(m)]\!]$, as on Figure 5. We get $T'$. First, $T'$ and $T$ have the same weight under $\mathbb P_n$. Second, $A_{u(m)}$ has the same value in $T$ and $T'$, and the nodes $u(m)$ in $T$ and $T'$ have the same depth ($u(m)$ is by definition the $m$th node).

Figure 5: Exchange of two nodes in a lineage

Now take $v$ the ancestor of $u(m)$ at depth $l$. By symmetry, $(A_{v,k,j})_{k,j}$ and $(A_{u(m),l,k,j})_{k,j}$ have the same distributions. And thus, by (i), for any $m \le n$, $l \le n$, $\mathbb P_n\big(\exists (k,j)\in I_K, \big|A_{u(m),l,k,j} - \mu_k l\big| \ge \gamma\sqrt{l\log n}\big)$ is certainly smaller than $c\,n^{7/2}\,n^{-2\gamma^2}$. As a direct consequence, $c\,n^{11/2}\,n^{-2\gamma^2}$ is a bound for $\mathbb P_n\big(\exists (k,j)\in I_K, u\in T, l\in[\![0,|u|]\!], \big|A_{u,l,k,j} - \mu_k l\big| \ge \gamma\sqrt{l\log n}\big)$. □

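We end this section with a result concerning multinomial random variables. For any $h > 0$, set

$$J_h = \Big\{a \in \mathbb N^{I_K}[h],\ (N_1(a), N_2(a)) \in \Big[\frac{\sigma^2 h}{2} - h^{2/3},\ \frac{\sigma^2 h}{2} + h^{2/3}\Big]^2\Big\}.$$

Lemma 9 For $h \in \mathbb N$, $N_1(M^{(h)})$ and $N_2(M^{(h)})$ have the same law, and there exist $c_1 > 0$, $c_2 > 0$ s.t.

$$\mathbb P\big(M^{(h)} \notin J_h\big) \le c_1\exp(-c_2 h^{1/3}).$$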

Proof. The first assertion is easy. Writing {|iVi(A1 '^)- 4^ | > n^/^} C Ujtj{l-^fej— ^/"fel > Aifc^^^^} 
(recall that ^Mfe = by Hoeffding, one has P(|X^^j— > Mfc^^/^) < 2exp(-/x|//iV3)][^^^Q. 
Summing this for (A;, j) G 7^, one gets the results. □ 

2.2.2 A first comparison Lemma 

In this section $S$ denotes a Polish space. For any r.v. $X$ taking its values in $S$, we denote by $\mathbb P_X$ the distribution of $X$: that is $\mathbb P_X(A) = \mathbb P(X \in A)$ for any Borelian $A$ of $S$.

Definition 1 Let $(Y_1, Y_2, \dots)$ and $(X_1, X_2, \dots)$ be two sequences of r.v. taking their values in $S$ such that $\mathbb P_{X_n}$ is absolutely continuous with respect to $\mathbb P_{Y_n}$; we write $\mathbb P_{X_n} \prec \mathbb P_{Y_n}$. Let $f_n$ be a non-negative measurable function such that $\mathbb P_{X_n} = f_n\,\mathbb P_{Y_n}$ (whose existence is ensured by the Radon-Nikodym theorem): for any Borelian $A$ of $S$, $\mathbb P_{X_n}(A) = \int_A f_n\, d\mathbb P_{Y_n}$. We say that $\mathbb P_{X_n}/\mathbb P_{Y_n} \to 1$, or $X_n /\!/ Y_n \to 1$, if $f_n$ goes to 1 in the following (weak) sense: for any $\varepsilon > 0$, the set $A^n_\varepsilon = \{x, |f_n(x) - 1| < \varepsilon\}$ satisfies $\mathbb P_{Y_n}(A^n_\varepsilon) \to 1$.

If $X_n /\!/ Y_n \to 1$ then $\mathbb P_{X_n}(A^n_\varepsilon) \to 1$, and for any $B \subset A^n_\varepsilon$, $|\mathbb P(Y_n \in B) - \mathbb P(X_n \in B)| \le \varepsilon\,\mathbb P(Y_n \in B)$, and then the total variation distance between $X_n$ and $Y_n$, defined by $\sup_{B \text{ Borelian}} |\mathbb P(X_n \in B) - \mathbb P(Y_n \in B)|$, goes to 0. Hence, the following Lemma is a straightforward consequence of the Portmanteau theorem:

Lemma 10 If $X_n /\!/ Y_n \to 1$ and $Y_n \xrightarrow[n]{(d)} Y$ then $X_n \xrightarrow[n]{(d)} Y$.

n n 

2.2.3 Proof of the convergence of the uni-dimensional distributions in Theorem 2 

In this section we work under $P_n$. Let $X^n_m := (A_{u(m)}, |u(m)|)$ and $Y^n_m := (A^\star_m, |u(m)|)$, where the distribution of $A^\star_m$ knowing $|u(m)| = h$ is simply $Q_h$. The aim of this section is to compare $X^n_m$ with $Y^n_m$ and to deduce from the asymptotic behavior of $Y^n_m$ some information on $X^n_m$. The proof of the convergence of the finite-dimensional distributions will also use this strategy.

For $M > 0$ and $n \in \mathbb{N}$, consider

$$\Lambda_{n,M} = \{(a,h),\ h \in \sqrt{n}\,[M^{-1}, M],\ a \in J_h\}.$$

We have 
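Proposition 11 i) For any $m, n$, $P_{X^n_m} \prec P_{Y^n_m}$.

ii) For any $s \in (0,1)$, $\alpha > 0$, there exists $M$ s.t. for $n$ large enough, $P_n(Y^n_{\lfloor ns \rfloor} \in \Lambda_{n,M}) \geq 1 - \alpha$, and for any $M > 0$,

$$\sup_{(a,h) \in \Lambda_{n,M}} \left| \frac{P_n(X^n_{\lfloor ns \rfloor} = (a,h))}{P_n(Y^n_{\lfloor ns \rfloor} = (a,h))} - 1 \right| \xrightarrow[n]{} 0. \qquad (16)$$

iii) For any $s \in (0,1)$, $X^n_{\lfloor ns \rfloor} // Y^n_{\lfloor ns \rfloor} \to 1$.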



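Proof. (iii) is a consequence of (ii). Let $a \in \mathbb{N}_{[h]}$. Since $\{A_{u(m)} = a\} \subset \{|u(m)| = h\}$, $P_n((A_{u(m)}, |u(m)|) = (a,h)) = P_n(A_{u(m)} = a)$. According to Proposition 5 and Formula (10),

$$\frac{P_n(X^n_m = (a,h))}{P_n(Y^n_m = (a,h))} = \frac{P\big(|f_{N_1(a)}| = m - h,\ |f'_{1+N_2(a)}| = n + 1 - m\big)}{P\big(|f_{N_1(M^{(h)})}| = m - h,\ |f'_{1+N_2(M^{(h)})}| = n + 1 - m\big)}. \qquad (17)$$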



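Then (i) holds true. Assume now that $s \in (0,1)$ and $\alpha > 0$ are fixed. There exists $M$ such that for $n$ large enough, $P_n(|u(\lfloor ns \rfloor)| \in \sqrt{n}[M^{-1}, M]) \geq 1 - \alpha/2$ (since $h_n \xrightarrow{(d)} \frac{2}{\sigma} e$ and since $P(e_s = 0) = 0$ for any $s \in (0,1)$). For such an $M$,

$$P_n(Y^n_{\lfloor ns \rfloor} \in \Lambda_{n,M}) \geq \sum_{l \in \sqrt{n}[M^{-1},M]} P_n(|u(\lfloor ns \rfloor)| = l)\, P_n\big(Y^n_{\lfloor ns \rfloor} \in \Lambda_{n,M} \,\big|\, |u(\lfloor ns \rfloor)| = l\big) \geq P_n\big(|u(\lfloor ns \rfloor)| \in \sqrt{n}[M^{-1},M]\big) \min_{l \in \sqrt{n}[M^{-1},M]} P(M^{(l)} \in J_l).$$

This infimum goes to 1 thanks to Lemma 9. According to Lemma 6 (i) and (ii), since $f$ and $f'$ are independent,

$$P\big(|f_{N_1(a)}| = \lfloor ns \rfloor - h,\ |f'_{1+N_2(a)}| = n + 1 - \lfloor ns \rfloor\big) = \frac{N_1(a)\,(1 + N_2(a))}{(\lfloor ns \rfloor - h)(n + 1 - \lfloor ns \rfloor)}\, P\big(W_{\lfloor ns \rfloor - h} = -N_1(a)\big)\, P\big(W_{n+1-\lfloor ns \rfloor} = -N_2(a) - 1\big)$$

and then for any $M > 0$,

$$\sup_{(a,h) \in \Lambda_{n,M}} \left| \frac{P\big(|f_{N_1(a)}| = \lfloor ns \rfloor - h,\ |f'_{1+N_2(a)}| = n + 1 - \lfloor ns \rfloor\big)}{Q_{n,s,h}} - 1 \right| \xrightarrow[n]{} 0$$

for

$$Q_{n,s,h} = \frac{\sigma^2 h^2}{8 \pi n^3 (s(1-s))^{3/2}} \exp\left( - \frac{\sigma^2 h^2}{8 n s (1-s)} \right).$$

Now, $P\big(|f_{N_1(M^{(h)})}| = \lfloor ns \rfloor - h,\ |f'_{1+N_2(M^{(h)})}| = 1 + n - \lfloor ns \rfloor\big) = A_h + B_h$ where

$$B_h = P\big(|f_{N_1(M^{(h)})}| = \lfloor ns \rfloor - h,\ |f'_{1+N_2(M^{(h)})}| = 1 + n - \lfloor ns \rfloor,\ M^{(h)} \in J_h\big)$$

and $A_h$ is the same probability on the event $\{M^{(h)} \notin J_h\}$.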
Using again Lemma 6 (i) and (ii), we get

$$\sup_{h \in \sqrt{n}[M^{-1},M]} \left| \frac{B_h}{Q_{n,s,h}} - 1 \right| \xrightarrow[n]{} 0.$$

On the other hand, $A_h \leq P(M^{(h)} \notin J_h) \leq c_1 \exp(-c_2 h^{1/3}) \leq c_1 \exp(-c\, n^{1/6}/M)$ for any $h \in \sqrt{n}[M^{-1}, M]$. To complete the proof of (ii), check that $\sup_{h \in \sqrt{n}[M^{-1},M]} |A_h / B_h| \to 0$. □

Corollary 12 For any $s \in (0,1)$, let $s_n = \lfloor ns \rfloor / n$; we have

$$\big(G^{(n)}(s_n), h_n(s_n)\big) // \big(\mathcal{G}(n, \sqrt{n}\, h_n(s_n)), h_n(s_n)\big) \to 1,$$

and the convergence of the uni-dimensional distributions holds in Theorem 2.
Recall that $\mathcal{G}$ is defined in Section 2.1.

Proof. Proposition 11 yields the first assertion of the Corollary. For the second one, by Lemma 10, it suffices to establish that for any $s \in [0,1]$

$$\big(\mathcal{G}(n, \sqrt{n}\, h_n(s_n)), h_n(s_n)\big) \xrightarrow[n]{(d)} (G_s, h_s) = (G(s), h(s)). \qquad (18)$$

For $s = 0$ or $s = 1$, this is a consequence of $h_n(s) \to 0$. For $s \in (0,1)$, since $h_n \xrightarrow{(d)} h$ in $C[0,1]$, by the Skorokhod representation theorem [14, Theorem 3.30], there exists a probability space on which this convergence is a.s. On this space (or on an augmented space on which the pair $(\mathcal{G}(n, \sqrt{n}\, h_n(s_n)), h_n(s_n))$ is defined), (18) holds a.s. To prove that the convergence of the uni-dimensional distributions holds in Theorem 2, it remains to control the distance between $(G^{(n)}(s_n), h_n(s_n))$ and $(G^{(n)}(s), h_n(s))$. Since $h_n \to h$, $|h_n(s_n) - h_n(s)| \to 0$. For $G^{(n)}$ this is more complex, and we will establish some bounds that are also useful for the tightness. Let

$$\Omega_n^\rho = \Big\{T \in T_n,\ \max_l \big||u(l+1)| - |u(l)|\big| \leq \rho \log n\Big\}.$$

Let $\varepsilon > 0$. According to Lemma 7, for $\rho$ large enough, $P_n(\Omega_n^\rho) \geq 1 - \varepsilon$ for $n$ large enough. We have, for $s'_n = \lfloor ns + 1 \rfloor / n$,

$$\|G^{(n)}(s_n) - G^{(n)}(s)\|_1 1_{\Omega_n^\rho} = n(s - s_n)\, 1_{\Omega_n^\rho}\, \|G^{(n)}(s'_n) - G^{(n)}(s_n)\|_1. \qquad (19)$$

In $\Omega_n^\rho$, the differences $|G^{(n)}_{k,j}(s'_n) - G^{(n)}_{k,j}(s_n)|$ are bounded by $\rho\, n^{-1/4} \log n$. Hence, since $s - s_n \leq 1/n$, for any $\varepsilon' > 0$, for $n$ large enough

$$\|G^{(n)}(s_n) - G^{(n)}(s)\|_1 1_{\Omega_n^\rho} \leq c\,(s - s_n)^{1/4 - \varepsilon'} \qquad (20)$$

for some constant $c$. One concludes that $\|G^{(n)}(s_n) - G^{(n)}(s)\|_1 \to 0$ in probability. □
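The passage from (19) to (20) is a short computation. The sketch below assumes that each coordinate increment of $G^{(n)}$ over one time step is bounded by $\rho\, n^{-1/4}\log n$ on $\Omega_n^\rho$ (our reading of the normalization), writes $\delta := s - s_n \in [0, 1/n]$, and fixes $\varepsilon' \in (0, 1/4)$; the constants $c$, $c'$ absorb the number of coordinates.

```latex
\begin{align*}
\|G^{(n)}(s_n) - G^{(n)}(s)\|_1
  &\leq n\delta \cdot c\,\rho\, n^{-1/4} \log n
   && \text{(by (19) and the bound on the increments)} \\
  &\leq (n\delta)^{1/4-\varepsilon'} \cdot c\,\rho\, n^{-1/4} \log n
   && \text{(since } 0 \leq n\delta \leq 1\text{)} \\
  &= c\,\rho\,\bigl(n^{-\varepsilon'} \log n\bigr)\, \delta^{1/4-\varepsilon'}
  \;\leq\; c'\, \delta^{1/4-\varepsilon'}.
\end{align*}
```

The last inequality uses that $n^{-\varepsilon'}\log n$ is bounded in $n$.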
2.3 Convergence of the finite-dimensional distributions

In this Section, $\kappa \geq 2$ is a fixed integer. We denote by $s^{(\kappa)}$ the vector $(s_1, \dots, s_\kappa)$ where $0 < s_1 < \dots < s_\kappa < 1$ are fixed. Let $T \in T_n$. For $i \in [\![1,\kappa]\!]$, set $u_i = u(\lfloor n s_i \rfloor)$, $u_0 = u_{\kappa+1} = \emptyset$. The aim of this section is to study the joint distribution of $(A_{u_i})_{i \in [\![1,\kappa]\!]}$ under $P_n$. The ideas are of the same type as in the case of the uni-dimensional distributions, but the details are more involved since the dependences between the r.v. $A_{u_i}$'s must be taken into account. For this, we must consider the shape of the tree spanned by the $u_i$'s.

Denote by $u_{i,j}$ the deepest (i.e. youngest) common ancestor of $u_i$ and $u_j$. Let $T_{s^{(\kappa)}} = \bigcup_{i=1}^{\kappa} [\![\emptyset, u_i]\!]$ be the subtree "spanned" by the $u_i$'s, $L(T)$ the set $\{u_i, i \in [\![0,\kappa]\!]\}$, and $Z(T) = \{u_{i,j},\ 1 \leq i < j \leq \kappa\} = \{u_{i,i+1},\ i \in [\![1, \kappa - 1]\!]\}$ the set of branching nodes in $T_{s^{(\kappa)}}$.

Definition 2 The shape function $b$ associates with $(T, s^{(\kappa)})$ the smallest tree having the same shape. Formally

$$b : T \times [0,1]^\kappa \to T, \qquad (T, s^{(\kappa)}) \mapsto T^b$$

where $T^b$ is characterized by:

i) $T^b \in T$ and $\#T^b = \#(Z(T) \cup L(T))$,

ii) there exists an increasing function $\Phi_T$ from $Z(T) \cup L(T)$ into $T^b$, preserving the descendants: $\Phi_T(u)$ is an ancestor of $\Phi_T(v)$ in $T^b$ iff $u$ is an ancestor of $v$ in $T$.

In other words, $T^b$ is the only tree with $\#(Z(T) \cup L(T))$ nodes having the same branching structure as $T_{s^{(\kappa)}}$ (see Figure 6); it can be constructed by squeezing the paths between the nodes of $Z(T) \cup L(T)$ into unit-length edges (and by renaming the vertices in order to get a tree). The function $\Phi_T$ is unique and, for short, for any $u \in Z(T) \cup L(T)$, we write $u^b$ instead of $\Phi_T(u)$.




Figure 6: A tree $T$ and the associated tree $\Phi(T) = \{\emptyset, 1, 11, 111, 112, 12, 121, 122, 123\}$.

The set $\mathfrak{E}(T)$ of "spanned branches" between the nodes of $L(T) \cup Z(T)$ is defined by

$$\mathfrak{E}(T) = \{(u,v),\ u, v \in L(T) \cup Z(T),\ u \neq v,\ u \prec v,\ ]\!]u,v[\![\, \cap (L(T) \cup Z(T)) = \emptyset\} = \{(u,v),\ u, v \in L(T) \cup Z(T),\ u^b = \mathrm{fa}(v^b)\},$$

where $\mathrm{fa}(u)$ stands for the father of $u$. Notice that if $(u,v) \in \mathfrak{E}(T)$ then $u$ is an ancestor of $v$. The LO order induces an order on $\mathfrak{E}(T)$ (also denoted by $\prec$): for $e_1 = (u_1, v_1)$ and $e_2 = (u_2, v_2) \in \mathfrak{E}(T)$, $e_1 \prec e_2$ if $v_1 \prec v_2$. We denote by $\mathcal{H}_T = (\#]\!]u,v]\!])_{(u,v) \in \mathfrak{E}(T)}$ the ordered list of the spanned branches distances.

In order to control the dependences between the $A_{u_i}$'s, we study analogous quantities associated with spanned branches. For any $(u,v) \in \mathfrak{E}(T)$, define $A_{(u,v),k,j}$, the content of the edge $(u,v)$, by

$$A_{(u,v),k,j} := \#\{w \in\, ]\!]u,v[\![,\ c_w = k,\ f_w(v) = j\}.$$

The contributions of the extremities of the spanned branches are not counted in any of the $A_{(u,v),k,j}$'s in order to simplify the enumerations in the rest of the paper. It is easy to check that

$$A_{(u,v),k,j} = (A_{v,k,j} - A_{u,k,j}) - 1_{(c_u, f_u(v)) = (k,j)}. \qquad (21)$$

The contributions of the nodes of $Z(T)$ are encoded by the sequence $\Theta_T$, ordered by $\prec$:

$$\Theta_T = (c_u, f_u(L(T)))_{u \in Z(T) \cup \{\emptyset\}}.$$

Notice that $f_u(L(T))$ is a subset of $\{1, \dots, c_u\}$ with $c_{u^b}$ elements.
2.3.1 Subtrees visited between two elements of L{T) 

(An illustration of the quantities considered in this section is given on Figure 7) . We denote by 

Sq = {v^T,d{v,l0,uil) = l,0^v^ui}, 

Si = {v eT,d{v,jui,Ui+il) = l,Ui -< V -< Ui+i}U {ui} foriG|l,K], 
S^+i = {veT,d{v,ju^,0j) = l,u^^v}lJ{u^} 

the set of the roots of the subtrees, rooted on the neighbors of $[\![u_i, u_{i+1}]\!]$, visited by the depth first traversal between $u_i$ and $u_{i+1}$ (up to the border effects). The cardinalities of the sets $S_l$ are characterized by the triplet $(A_T, \Theta_T, T^b)$, where $A_T = (A_{(u,v)})_{(u,v) \in \mathfrak{E}(T)}$, and we have

$$\#S_l(T^b, A_T, \Theta_T) = M_{l,l+1}(T^b, A_T) + Y_{l,l+1}(T^b, \Theta_T) \qquad (22)$$

where, for any $l$,

yi,i+,{T\&T) = li^o+/«M+i(^/+i)-/«M+i(^')-l+ E c,-/,(n,)+ Uz{ui)-1) 
and 

$$M_{l,l+1}(T^b, A_T) = \sum N_1(A_{(z_1,z_2)}) + \sum N_2(A_{(z_1,z_2)}),$$

where the first sum is taken on the pairs $(z_1, z_2)$ such that $z_1^b, z_2^b \in [\![u^b_{l,l+1}, u^b_{l+1}]\!]$ and $z_1^b = \mathrm{fa}(z_2^b)$, and the second one on the pairs $(z_1, z_2)$ such that $z_1^b, z_2^b \in [\![u^b_{l,l+1}, u^b_l]\!]$ and $z_1^b = \mathrm{fa}(z_2^b)$.

This is similar to Proposition 5: $M_{l,l+1}(T^b, A_T)$ counts the number of subtrees rooted on the neighbors of the spanned branches, on their right or on their left; $Y_{l,l+1}(T^b, \Theta_T)$ counts the number of subtrees rooted on the neighbors of the nodes of $Z(T)$. We let $T_u = \{v \in T_\infty : uv \in T\}$ be the fringe subtree of $T$ rooted at $u$. The cardinality $F_l = \#(T_u)_{u \in S_l}$ of the forest constituted by the fringe subtrees of $T$ rooted on the nodes of $S_l$ satisfies

$$F_l(\mathcal{H}_T, T^b) = (n s_{l+1} - n s_l + 1) - (|u_{l+1}| - |u_{l,l+1}|) - 1_{l=0}, \qquad (23)$$

since the visit times of the nodes $u_l$ are $n s_l$ and since $|u_{l+1}| - |u_{l,l+1}| + 1$ nodes visited during $[\![n s_l, n s_{l+1}]\!]$ are not in $(T_u)_{u \in S_l}$.
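2.3.2 Decomposition of a tree T given the $A_T$ and $T^b$

Let $T^{\star}_{2\kappa-1} = \{T \in T_{2\kappa-1},\ \deg(\emptyset) = 1,\ \forall u \in T \setminus \{\emptyset\},\ \deg(u) \in \{0, 2\}\}$ be the set of trees with $2\kappa - 1$ edges, with binary branching points (except the root that has only one child). Denote by

$$\Lambda_{n,M} = \{T \in T_n,\ \forall (u,v) \in \mathfrak{E}(T),\ d(u,v) \in \sqrt{n}[M^{-1}, M],\ T^b \in T^{\star}_{2\kappa-1}\}.$$

A tree in $\Lambda_{n,M}$ has its shape in $T^{\star}_{2\kappa-1}$ and all its spanned branches lengths in $\sqrt{n}[M^{-1}, M]$.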

Lemma 13 For any $\varepsilon > 0$, there exists $M > 0$ such that for $n$ large enough

$$P_n(\Lambda_{n,M}) \geq 1 - \varepsilon.$$

Proof. This is a consequence of $h_n \xrightarrow{(d)} h = 2e/\sigma$ and the properties of $e$: $e$ is a.s. non null on $(0,1)$, and the local minima of $e$ are a.s. all different (the continuum random tree is a.s. a binary tree). □

For any T G A„^m, $t sends the nodes of L(T) \ {0} on the leaves of T^, and the nodes of 
Z(T) on the internal nodes of (different from 0), #Z(T) = k — 1 (the branching nodes are 
distinct), L{T) n Z{T) = and Qt belongs to = V2 x V^"^ where V2 = {{c,x), 1 < x < c} 
and V3 = {{c,x,y), 1 <x <y <c}. Note that under (Hi), is a subset of '^"^ 

2.3.3 A second comparison result 

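On the one hand, consider $(A_n, \mathcal{H}_n, \Theta_n, T^b_n)$, the r.v. $(A_T, \mathcal{H}_T, \Theta_T, T^b)$ when $T$ is $P_n$-distributed. On the other hand, we define $(A^\star_n, \mathcal{H}_n, \Theta^\star_n, T^b_n)$ as follows. Let $\mathcal{H}_n = (\mathcal{H}_n^1, \dots, \mathcal{H}_n^{\#\mathcal{H}_n})$ and $A^\star_n = (A^{\star}_{n,i})_{i=1,\dots,\#\mathcal{H}_n}$. Conditionally on $\mathcal{H}_n$, the r.v. $A^\star_{n,i}$'s are independent with respective distributions $Q_{\mathcal{H}_n^i}$. The r.v. $\Theta^\star_n = (\Theta^\star_n(i))_{i=1,\dots,\kappa}$ is independent of $(\mathcal{H}_n, A^\star_n, T^b_n)$ and:

$$P(\Theta^\star_n(1) = (o, j_1)) = \mu_o \quad \text{for any } (o, j_1) \in V_2,$$
$$P(\Theta^\star_n(i) = (o, j_1, j_2)) = 2\mu_o / \sigma^2 \quad \text{for any } (o, j_1, j_2) \in V_3,\ i \geq 2.$$

Since the mean and the variance under $\mu$ are respectively 1 and $\sigma^2$, these formulas indeed define two distributions. For any object $O$ and $l \in \mathbb{N}$, denote by $O[l] = (O_1, \dots, O_l)$. Let

$$\Gamma_{n,M} = \{(a[2\kappa-1], x[2\kappa-1], \theta[\kappa], \tau^b),\ x_i \in \sqrt{n}[M^{-1}, M],\ a_i \in J_{x_i},\ \tau^b \in T^\star_{2\kappa-1}\} \cap \mathrm{supp}(A^\star_n, \mathcal{H}_n, \Theta^\star_n, T^b_n).$$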
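The claim that the two formulas for the law of $\Theta^\star_n$ define probability distributions can be verified by exact arithmetic. The sketch below assumes the weights $\mu_o$ on $V_2 = \{(c,x), 1 \leq x \leq c\}$ and $2\mu_o/\sigma^2$ on $V_3 = \{(c,x,y), 1 \leq x < y \leq c\}$ (the second weight is our reading of a garbled formula), and uses a hypothetical offspring law with mean 1; the function name is also an assumption.

```python
from fractions import Fraction

def theta_laws(mu):
    """Return (mean, variance, law on V2, law on V3) for an offspring law {k: mu_k}."""
    mean = sum(k * p for k, p in mu.items())
    var = sum(k * k * p for k, p in mu.items()) - mean ** 2
    # Weight mu_o on each pair (o, j1) with 1 <= j1 <= o.
    v2 = {(o, j1): p for o, p in mu.items() for j1 in range(1, o + 1)}
    # Weight 2 * mu_o / sigma^2 on each triple (o, j1, j2) with 1 <= j1 < j2 <= o.
    v3 = {
        (o, j1, j2): 2 * p / var
        for o, p in mu.items()
        for j1 in range(1, o + 1)
        for j2 in range(j1 + 1, o + 1)
    }
    return mean, var, v2, v3

# Hypothetical critical offspring law with bounded support (mean 1).
example_mu = {0: Fraction(1, 2), 1: Fraction(1, 6), 2: Fraction(1, 6), 3: Fraction(1, 6)}
```

The total mass on $V_2$ is $\sum_o o\,\mu_o$, the mean, and the total mass on $V_3$ is $\sum_o \binom{o}{2} 2\mu_o/\sigma^2$, which equals 1 because $\sum_o o(o-1)\mu_o = \sigma^2$ when the mean is 1.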
Figure 7: The ordered spanned branches are $(\emptyset, z_1)$, $(z_1, z_2)$, $(z_2, u_1)$, $(z_2, u_2)$, $(z_1, u_3)$, and $\Theta_T = ((4,2), (6,2,5), (5,2,4))$. The triangles are subtrees not drawn. The forests considered in the decomposition are surrounded.



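The following Proposition generalizes Proposition 5 to finite-dimensional distributions.

Proposition 14 (i) For any $\varepsilon > 0$, there exists $M$ such that

$$P\big((A^\star_n, \mathcal{H}_n, \Theta^\star_n, T^b_n) \in \Gamma_{n,M}\big) \geq 1 - \varepsilon.$$

(ii)

$$\sup_{(a,x,\theta,\tau^b) \in \Gamma_{n,M}} \left| \frac{P\big((A_n, \mathcal{H}_n, \Theta_n, T^b_n) = (a,x,\theta,\tau^b)\big)}{P\big((A^\star_n, \mathcal{H}_n, \Theta^\star_n, T^b_n) = (a,x,\theta,\tau^b)\big)} - 1 \right| \xrightarrow[n]{} 0. \qquad (24)$$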

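Comments 1 In general $P_{(A_n, \mathcal{H}_n, \Theta_n, T^b_n)} \not\prec P_{(A^\star_n, \mathcal{H}_n, \Theta^\star_n, T^b_n)}$ since $\mathrm{supp}(\Theta^\star_n)$ is strictly included in $\mathrm{supp}(\Theta_n)$ when $\mu[3, +\infty) > 0$ (the variable $\Theta^\star_n$ mimics the coding of binary branchings on $T_n$). In that case, moreover, $P_n(T^b_n \notin T^\star_{2\kappa-1}) > 0$ for $n$ large enough, and no control of $(A_n, \mathcal{H}_n, \Theta_n, T^b_n)$ is provided outside $T^\star_{2\kappa-1}$. Notice that the condition $P_{X_n} \prec P_{Y_n}$ in Definition 1 may be replaced by the following weaker condition, sufficient to keep the conclusion of Lemma 10: for any $\varepsilon > 0$, for $n$ large enough, there exist a measurable set $A^n_\varepsilon$ such that $P_{Y_n}(A^n_\varepsilon) \geq 1 - \varepsilon$ and a function $f_n : A^n_\varepsilon \to \mathbb{R}$ satisfying $P_{X_n} = f_n P_{Y_n}$ on $A^n_\varepsilon$ and $\sup |f_n - 1| \leq \varepsilon$. Here, this is the case on $\Gamma_{n,M}$.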

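Proof. (i) is a consequence of Lemmas 13 and 7.

For (ii), let $(a, x, \theta, \tau^b) \in \Gamma_{n,M}$ for $\theta = ((o_1, j_1^1), (o_2, j_2^1, j_2^2), \dots, (o_\kappa, j_\kappa^1, j_\kappa^2))$. By classical properties of GW trees, $P\big((A_n, \mathcal{H}_n, \Theta_n, T^b_n) = (a, x, \theta, \tau^b)\big)$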



'2k~\ 



.(0 



Fl{x,T^),0 <1<K+1 



\i=l 



P(|T| = n) 



(25) 
where the $f^{(l)}$'s are independent forests. On the other hand, $P\big((A^\star_n, \mathcal{H}_n, \Theta^\star_n, T^b_n) = (a, x, \theta, \tau^b)\big)$

^(^:,e;) = (a,f?) I {nn,T'n) = (x,r^)) P ((Wn,T^) = (x,r^)) 

(W„,T^) = (x,r^ 



n «2x.(a.) n 



1=1 



\i=l 



summing formula (25) on all possible values of the a^'s and the 0's leads to 



P(|T| =n) 



(26) 



where m = (mj)j=i^...^^ is a vector of k multinomial independent r.v. (the parameters of rrij are Xj 
and p), and where 6 = {0{i))ieii,Kj — ©n is independent of m. Hence, for (a,x, ^,r'') G r„^M, 



' ((-4„,7^„,e„,T,^J = (a,x,^,r^)) _ ^ (|f|W^a,^)l = ^Kx,r^),0 < I < . + l) 



V #5(r'',m,e)' tv ' — — I y 

It is easy to check that for any (a,x, r'',^) in Tn,M, any I, for n large enough 

^2 



(27) 



IFKx.r") -n(si+i -sOI < n^/^ |#5Kr^a,^) - yd(nz,u,+i)| < n^/'\ 

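since $(1/2)(2/3) < 5/12$. This allows us to approximate on the one hand $F_l(x, \tau^b)$ by $n(s_{l+1} - s_l)$, and on the other hand $\#S_l(\tau^b, a, \theta)$ by $\frac{\sigma^2}{2} d(u_l, u_{l+1})$ on $\Gamma_{n,M}$ (since $n^{5/12} = o(n^{1/2})$, the order of $d(u_l, u_{l+1})$). So, using Otter and the central local limit theorem and also a decomposition of the denominator along $\{m \in \prod_j J_{x_j}\}$ or its complement (as in the proof of Proposition 11), we get

$$\sup_{(a,x,\theta,\tau^b) \in \Gamma_{n,M}} \left| \frac{P\big((A_n, \mathcal{H}_n, \Theta_n, T^b_n) = (a,x,\theta,\tau^b)\big)}{P\big((A^\star_n, \mathcal{H}_n, \Theta^\star_n, T^b_n) = (a,x,\theta,\tau^b)\big)} - 1 \right| \xrightarrow[n]{} 0. \qquad (28)$$

□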



2.3.4 Proof of the convergence of the finite-dimensional distribution in Theorem 2 

We now show that Proposition 14 implies the convergence of the finite-dimensional distributions in Theorem 2. The proof is similar to that of Corollary 12.

Thanks to the Skorokhod representation theorem [14, Theorem 3.30], there exists a probability space $\Omega$ on which the convergence of $h_n$ to $h$ is a.s. On $\Omega$, the vector

$$V_n = (h_n(s_1), h_n(s_1, s_2), h_n(s_2), h_n(s_2, s_3), \dots, h_n(s_\kappa)),$$

which determines $T^b_n$ as well as the lengths of the spanned branches, converges a.s. to

$$V_\infty = (h(s_1), h(s_1, s_2), h(s_2), h(s_2, s_3), \dots, h(s_\kappa)),$$

which determines $T_s$, the subtree of the continuum random tree $T_\infty$ with contour process $h$, spanned by the root and the nodes visited at times $s_1, \dots, s_\kappa$ (see Aldous [1, 2]). With probability 1, the coordinates of $V_\infty$ are distinct and non zero, and then $T_s$ has its shape in $T^\star_{2\kappa-1}$. Let $\mathcal{H}_\infty = (\mathcal{H}_\infty^i)_{i \in [\![1, 2\kappa-1]\!]}$ be the lengths of the (sorted) spanned branches in $T_s$. On $\Omega$, $\mathcal{H}_n / \sqrt{n} \to \mathcal{H}_\infty$ a.s., and for $M$ large enough ($M$ depending on $T_\infty$) and $n$ large enough, $T_n \in \Lambda_{n,M}$.

Denote by (^^)je[i,2K-i| the (sorted) corresponding content of the spanned branches of T„, 
and by ('H^)ig|i^2K-i] = (|^nl)ie[i,2«;-i] their lengths. The normalized contents are then given by 

A consequence of Proposition 14, is that 

((gf^k[l,2«-ll,Wn/v^)//((e« (^i'^n))ie[l,2K-l],'^n/-\/^) ^ 1, 

in the sense of Comment 1 (which slightly modifies Definition 1) where the r.v. ^^*^(n, H^J's are 
independent, and conditionally on = (^'■^^ (n, H^J) = Q{n,l). On Vt, 7in/\/n ^^^^ Ti-oo, 

n 

and then by Proposition 3, (^/'■*-'(n,'H*j))jgp 2k-i] converges in distribution to a centered Gaussian 
vector (^,^'*)ie[i,2«;-i] with independent coordinates, where has variance 7ico,i- This implies 
the convergence of the finite-dimensional distributions in Theorem 2. □ 

2.4 Tightness in Theorem 2 

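We only prove the tightness of the family $(G^{(n)})$, since one already knows that $(h_n)$ is tight (since $h_n \xrightarrow{(d)} h$). In this section, we assume (H1) and (H2).

We collect in the set $\Omega_n^{\alpha,\delta,\gamma,\rho}$ the trees with $n$ edges having some suitable properties:

$$\Omega_n^{\alpha,\delta,\gamma,\rho} = \Big\{T \in T_n,\ \forall t, s \in [0,1],\ |h_n(s) - h_n(t)| \leq \delta |t - s|^\alpha,\ \max_l \big||u(l+1)| - |u(l)|\big| \leq \rho \log n,\ |u(n)| \leq \rho \log n,\ \forall (k,j) \in I_K,\ l \in (0, |u|],\ |A_{u,l,k,j} - \mu_k l| \leq \gamma \sqrt{l} \log n \Big\}.$$

Lemma 15 For any $\varepsilon > 0$, $\alpha < 1/2$, there exist $\delta > 0$, $\gamma > 0$, $\rho > 0$ s.t. $P(\Omega_n^{\alpha,\delta,\gamma,\rho}) \geq 1 - \varepsilon$.

According to Lemmas 7 and 8, and Remark 1, only the condition on the Hölder continuity of $h_n$ has to be checked. This is postponed to the end of the paper.

Let $\varepsilon > 0$ be fixed. Set $\alpha = 2/5$ and choose $\delta > 0$, $\gamma > 0$, $\rho > 0$ s.t. $P(\Omega_n^{\alpha,\delta,\gamma,\rho}) \geq 1 - \varepsilon$ for $n$ large enough. For these choices, write $\Omega_\varepsilon$ instead of $\Omega_n^{\alpha,\delta,\gamma,\rho}$.
We will establish the following Proposition.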

Proposition 16 There exist $\alpha > 0$, $\beta > 0$, $c > 0$ s.t. for $n$ large enough,

$$E\big(\|G^n_s - G^n_t\|_1^\beta 1_{\Omega_\varepsilon}\big) \leq c|t - s|^{1+\alpha} \quad \text{for any } s, t \in [0,1]. \qquad (29)$$

This implies that the $(1+\alpha)/\beta$-Hölder norm of the family $(G^n)$ is tight, and then that $(G^n)$ is tight in $C([0,1])^{\#I_K}$ (recall that $G^n_0$ is the null vector of $\mathbb{R}^{\#I_K}$).

We first point out that using (20), we get that for any $\alpha > 0$ there exists $\beta > 0$ such that for $n$ large enough $E\big(\|G^{(n)}(s_n) - G^{(n)}(s)\|_1^\beta 1_{\Omega_n^\rho}\big) \leq c(s - s_n)^{1+\alpha}$. Hence, we can restrict ourselves to proving (29) only for $s$ and $t$ such that $ns$ and $nt$ are integers (this is classical). From now on, we assume that $s, t$ are in $[0,1]_n := [0,1] \cap \mathbb{N}/n$, and $s \neq t$.

We set $u_1 = u(\lfloor ns \rfloor)$, $u_2 = u(\lfloor nt \rfloor)$, $u_{1,2}$ their deepest common ancestor and $D_n(s,t) = d(u_1, u_2)$. There exists $\delta' > 0$ such that for $T \in \Omega_\varepsilon$, any $s, t \in [0,1]_n$, $s \neq t$,

$$D_n(s,t) \leq 2 + H_n(nt) + H_n(ns) - 2 \min_{k \in [ns, nt]} H_n(k) \leq 2\delta\sqrt{n}\,|t - s|^{2/5} + 2 \leq \delta' \sqrt{n}\,|t - s|^{2/5}.$$

Lemma 17 For any $\alpha' > 0$, $\alpha > 0$, there exist $\beta > 0$, $c > 0$ s.t. for any $s, t \in [0,1]$ such that $|s - t| \leq (\log n)^{-\alpha'}$, for $n$ large enough,

$$E\big(\|G^n_s - G^n_t\|_1^\beta 1_{\Omega_\varepsilon}\big) \leq c|t - s|^{1+\alpha}. \qquad (30)$$

Proof. Let $s, t \in [0,1]_n$, $s \neq t$. We use a deterministic bound valid for all trees $T$ in $\Omega_\varepsilon$. Let $(k,j) \in I_K$ be fixed. As in the proof of Proposition 4, it suffices to show that

$$n^{-\beta/4} \big|A_{u_1,k,j} - \mu_k|u_1| - A_{u_2,k,j} + \mu_k|u_2|\big|^\beta \leq c|s - t|^{1+\alpha}. \qquad (31)$$

Passing via $u_{1,2}$, the left hand side of (31) is smaller than

cin-^/^ (\A^,,huk,j - f^k\hi\f + \AuM,3 - Mfcl^2||^ + 2^) 

where hi := d(ui, 111^2) — 1 and /12 := d{u2,ui^2) — 1 (the contribution of ui^2 is bounded by the 
term 2). Using that \Au^i,k,j — /"fc^l < iV^ logn for any I and I < Dn{s, t) < 5'v}/'^\t — sp/^, we find 

n-'^'^ \Au,,k,j - iJik\ui\\^ + \Au^,k,j - tik\u2\\^ <C2{t- sf'\\ogn)f^/'' 

and since |t — s| < (logn)~^, \t — s|^/^(log n)^/^ < 1 and then C2{t — s)^/^(logn)^/^ is smaller than 
\t — s\^^°' for /3 and n large enough. □ 

Lemma 18 For any $\alpha > 0$, there exist $\beta > 0$, $c > 0$ s.t. for any $t \in [0,1]$, for any $n$ large enough,

$$E\big(\|G^n_t\|_1^\beta 1_{\Omega_\varepsilon}\big) \leq c\, t^{1+\alpha}. \qquad (32)$$

Proof. Consider first the case t = 1. In $7^, we have \u(n)\ < plogn and then 

E(||G?||flf^J<ci(plogrz)^n-^/2 

and this is smaller than cl^"*"" for any a > 0, c > 0, /? > for n large enough. By the previous 
Lemma and a simple computation (using that max ||'u(Z + 1)| — |'u(Z)|| < plogn) one sees that (32) 
is true if t ^ where 

F„:= [(logn)-^l-(logn)-3]. 
Assume now that $t \in V_n$. In $\Omega_\varepsilon$, the Hölder property of $h_n$ and the inequality $|u(n)| \leq \rho \log n$ imply that for $t \in V_n$,

$$|u(\lfloor nt \rfloor)| \leq L_n(t) := c_2\, n^{1/2}\, [t \wedge (1 - t)]^{\alpha}. \qquad (33)$$

For any real number a, we denote by a./j. the vector {afj,k)(k,j)&iK- Using (5) and (13), there 
exists C3 > such that fov t EVn and n large enough, E 

C3 E E "'7/4-3^5 ^ (ifiv,(a)i = - if = n + 1 - m) 

h<Ln(t) aeN[^j 
and by Otter than 

^ ^ . ||a - h.fif, iVi(a)(l + N2ia))nWint\-h = iVi(a))P(M/„_LntJ+i = 1 + M^)) 
C4 2^ ^^/^^^^ „/3/4-3/2 nnt\ - h)in - \nt\ + 1) 

h<Ln{t) aeN^^j 

where $(W_k)$ is the random walk described at the beginning of Section 2.2.1. In order to bound these last two probabilities, we use a classical concentration property valid for any non-degenerate random walk $(W_k)_k$ (trivial consequence of Petrov [23, Theo. 2.22 p. 76]): there exists a constant $c_5$ such that for any $n > 0$,

$$\sup_y P(W_n = y) \leq c_5 / \sqrt{n}. \qquad (34)$$
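The concentration bound (34) can be observed numerically on a simple example. The sketch below assumes that the walk has i.i.d. increments distributed as $\xi - 1$, where $\xi$ follows a hypothetical offspring law with bounded support; this choice of increments, the example law and the function names are assumptions made for the illustration. The law of $W_n$ is computed exactly by repeated convolution, and $\sqrt{n}\,\sup_y P(W_n = y)$ is returned.

```python
def step_law(offspring):
    """Law of xi - 1 when xi has the offspring law {k: mu_k}."""
    return {k - 1: p for k, p in offspring.items() if p > 0}

def walk_law(offspring, n):
    """Exact law of W_n = sum of n i.i.d. steps, computed by convolution."""
    law = {0: 1.0}
    steps = step_law(offspring)
    for _ in range(n):
        nxt = {}
        for y, p in law.items():
            for d, q in steps.items():
                nxt[y + d] = nxt.get(y + d, 0.0) + p * q
        law = nxt
    return law

def scaled_sup(offspring, n):
    """sqrt(n) * sup_y P(W_n = y); bounded in n according to (34)."""
    return max(walk_law(offspring, n).values()) * n ** 0.5

# Hypothetical critical offspring law: uniform on {0, 1, 2}.
uniform_mu = {0: 1.0 / 3.0, 1: 1.0 / 3.0, 2: 1.0 / 3.0}
```

For this example the scaled supremum stays below 1 for the values of $n$ we tried, in line with the local limit theorem, which predicts a limit of $1/(\sigma\sqrt{2\pi})$.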

Now, for any $a \in \mathbb{N}_{[h]}$, $N_1(a)$ and $N_2(a)$ are smaller than $Kh$, and for any $h \leq L_n(t)$, $t \in V_n$ and $n$ large enough, $nt - h \geq nt/2$. We then get

TO'nir-n 11/3-11 ^ ^ , Qfe(a)||a - 

HW^thilnJ < C6 2^ ^ n/3/4-3/2(L„tJ- ^)3/2(„_LntJ + 1)3/2- 

Using Proposition 4, we obtain that for any t G Vn, 

E(||G, II, InJ < C7^,/4+3/2(,(l_,))3/2 ^ ^« (t(l-t))3/2 " ^ 

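Remark 2 The last formula implies that for any $\alpha > 0$, there exist $\beta > 0$, $c > 0$ s.t. for any $t \in V_n$, for any $n$ large enough,

$$E\big(\|G^n_t\|_1^\beta 1_{\Omega_\varepsilon}\big) \leq c\,(t \wedge (1 - t))^{1+\alpha}. \qquad (35)$$

This allows us to prove a part of Proposition 16: since $E\big(\|G^n_s - G^n_t\|_1^\beta 1_{\Omega_\varepsilon}\big) \leq c\, E\big(1_{\Omega_\varepsilon}(\|G^n_s\|_1^\beta + \|G^n_t\|_1^\beta)\big)$ when $s, t \in V_n$ and $s < t$,

- if $s \leq t - s$ (in this case $t \leq 2(t - s)$) then $E\big(\|G^n_s - G^n_t\|_1^\beta 1_{\Omega_\varepsilon}\big) \leq c\,(t - s)^{1+\alpha}$,
- if $1 - t \leq t - s$ (in this case $1 - s \leq 2(t - s)$) then

$$E\big(\|G^n_s - G^n_t\|_1^\beta 1_{\Omega_\varepsilon}\big) \leq c\big((1 - t)^{1+\alpha} + (1 - s)^{1+\alpha}\big) \leq c_2 (t - s)^{1+\alpha}.$$

Thanks to these remarks, only the case $s, t \in V_n$, $s < t$, and

$$[s \wedge (1 - s)] \geq t - s \quad \text{and} \quad [t \wedge (1 - t)] \geq t - s \qquad (36)$$

remains to be checked. So assume that $s$ and $t$ satisfy these constraints.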

Consider A„ = {Al,Al,Al) = (^(«i,2,«i)> ^(«i,2,w2)' ^(0,ui.2)) contents of the "three" 
spanned branches in (some of these spanned branches may be empty). We have 

^,{Al = ai,i = 1,2,3) fllai - + ||a2 - /i2./^||? 



E 



|G^-Gr||fln.)< E E 4^7^ ^(37) 



hl,h2,h3 31,32,33 



where the first sum is taken on /11+/13 < Ln{s), Z'^ + ^s < Ln{t), /ii + /i2 < Dn{s,t) := (5'n"^/^|t — 
where Ln{x) is given in (33). By Section 2.3.1, and the Otter formula, /ii, /12, /13, ai, 82, 83 fixed, 

F^iAl, = a„z = 1,2,3) < cn^/^ sup J] Q,, (a.) ^^^M]^^^^ (38) 

^ i=l * 

where the supremum is taken on 9 = {61,62, 63) G |0, K}^, and where Fi = ns + 1 — \u{lns\)\ — 1, 
F2 =n{t-s) + l- {\u{lnt\)\ - |ui,2|), F3 = n(l - t) + 1, Si = 7Vi(a3) + iVi(ai) + di, S2 = 
iV2(ai) + iVi(a2) + 9-2, = iV2(a2) + iV2(a3) + ^3. 

We plug these bounds in (37), and bound the left hand side using the following ingredients:

- the probabilities in (38) involving the random walks are bounded using (34);

- for $a \in \mathbb{N}_{[h]}$, $N_1(a) \leq Kh$ and then for a constant $c > 0$,

$S_1 \leq K|u(\lfloor ns \rfloor)| + \theta_1 \leq c\, L_n(s)$,

$S_2 \leq c\, D_n(s,t)$,

$S_3 \leq K|u(\lfloor nt \rfloor)| + \theta_3 \leq c\, L_n(t)$.

The denominator are bounded using \t-s\ > (logn)-^, [tA(l-t)] > (logn)-^, [sA(l-s)] > (logn)"^, 
and then for n large enough, 

$F_1 \geq ns/2$, $F_2 \geq n(t - s)/2$, $F_3 \geq n(1 - t)/2$.

Finally we get that the left hand side of (37) is smaller than



Ln{s)Ln{t)Dn{s,t) T.h,MM ^31,32,33 nLi ^hM) - + I|a2 - /i2./x||? 



[n3(s A (1 - s)){t A (1 - t)){t - s)]^/^ 

The double sum is smaller than 

hi,h'2,hz 

this last factor Ln{s) being a bound of /13. Finally, 

E (\\g: - Giti..) < cn3/^-^/^ (i;(.))-i;(t)(A;(M))^/-^^ 

By (36), it suffices to take /? large enough. □ 
2.5 Proof of Theorem 1

Consider the representation of $\ell(u)$ given in (2). For any $s$ such that $ns$ is an integer,

$$r_n(s) = r_n^{(1)}(s) + r_n^{(2)}(s) \qquad (39)$$



where 



{k,j)eiK '=1 
(A:,i)e/K 

where $m = (m_{k,j})_{(k,j) \in I_K}$ and $\langle a, b \rangle = \sum_{(k,j) \in I_K} a_{k,j} b_{k,j}$. For $s$ in $[i/n, (i+1)/n]$, $r_n^{(1)}(s)$ and $r_n^{(2)}(s)$ are defined by linear interpolation. Since $h_n \xrightarrow{(d)} h$ in $C[0,1]$, by the Skorokhod representation theorem [14, Theorem 3.30], there exists a probability space $\Omega$ on which this convergence is a.s. On this space, by Theorem 2, $G^{(n)}$ converges in distribution in $C([0,1])^{\#I_K}$ to $G^h$, where $G^h$ has the distribution of $G$ knowing $h$. Now, since the application

$$C([0,1])^{\#I_K} \to C[0,1], \qquad (s \mapsto g(s)) \mapsto (s \mapsto \langle g(s), m \rangle)$$

is continuous, on $\Omega$ we have

$$\langle G^{(n)}, m \rangle \xrightarrow[n]{(d)} r^{(2)} := \langle G^h, m \rangle \qquad (40)$$

in $C([0,1])$. On $\Omega$, $r^{(2)}$ is a centered Gaussian process with covariance function

$$\mathrm{cov}\big(r^{(2)}(s), r^{(2)}(t)\big) = h(s,t) \sum_{(k,j) \in I_K} \sum_{(k',j') \in I_K} \big(-\mu_k \mu_{k'} + \mu_k 1_{(k,j) = (k',j')}\big)\, m_{k,j}\, m_{k',j'}.$$

On the other hand, $r_n^{(1)}$ is the standard head of a discrete snake associated with independent displacements. As shown in [19], under (H1) and (H2),

$$r_n^{(1)} \xrightarrow[n]{(d)} r^{(1)} \qquad (41)$$

in $C([0,1], \mathbb{R})$, where $r^{(1)}$ is a centered Gaussian process with covariance function

cov(rW(.),rW(i))=h(.,t) 

(fe,i)e//f 

We shall prove that, given $h$, the finite-dimensional distributions of $r^{(1)}$ and $r^{(2)}$ are independent. We establish the "asymptotic independence" between the two processes $r_n^{(1)}$ and $r_n^{(2)}$ knowing $h$. The arguments are quite straightforward; we only make the uni-dimensional case explicit. Let
According to Lemma 9 in [19] (it is also a consequence of Lemma 15), for any $\nu > 0$, $\varepsilon > 0$, if $n$ is large enough $P_n(\mathcal{T}_n^\nu) \geq 1 - \varepsilon$. Let $s \in [0,1]$ (such that $ns$ is an integer); one may compare

L«|w(M)|-nV4+.^J 

{k,j)eiK i=i 

where the same r.v. F^^'j are involved in both and rn \ with rn\s). Since knowing |u([nsj)|, 
r^(s) is clearly independent of r^j'^(s), the proof of \Yn {s) — r'^(s)| ^ will prove our claim 

n 

(in the uni-dimensional case). We have 

P„(|rU.)-rW|>x) < P(|rU.)-rW|>x,r„-)+P„(r„\7;^). 

The last term goes to 0 for any $\nu > 0$. The Rosenthal inequality [23, Theorem 2.11] asserts that if $(X_k)_k$ is a sequence of independent centered r.v. and $q \geq 2$, then

$$E\Big(\Big|\sum_{i=1}^n X_i\Big|^q\Big) \leq c(q) \Big( \sum_{i=1}^n E(|X_i|^q) + \Big(\sum_{i=1}^n E(X_i^2)\Big)^{q/2} \Big) \qquad (42)$$

where $c(q)$ is a positive constant depending only on $q$. For $p$ satisfying (H2), we have

nir^s) - rW| > x,T:) < E (x'^lrU.) - r«|^lr„.) 

Conditioning at first by the ^(„(ns)), and using (42), we get T{K{,) - r«| > i.T^) < 

--MP)/ ^ 2„>/^."E(|y.„-™.,n + ( 5: 2„>/^.vL.,)'"') 



and then for $\nu < 1/4$, for any $x > 0$ the bound goes to 0.

Hence $r$ is a centered Gaussian process whose covariance function is the sum of those of $r^{(1)}$ and
r(2). Using that J2^k,j) fJ'kmkj = m = 0, we get cov(r(s), r(i)) = h(s, t) J2^k,j) l^k^O^kj)- ^■ 

2.6 Proof of Lemma 15 

We say that a sequence of processes $(u_n)$ defined on $[0,1]$ is uniformly Hölder continuous in probability with exponent $\alpha$ ($\alpha$-UHCP) if for every $\varepsilon > 0$ there exists a real number $C_\varepsilon$ such that for every $n$,

$$P\big(|u_n(s) - u_n(t)| \leq C_\varepsilon |t - s|^\alpha \text{ for all } s, t \in [0,1]\big) \geq 1 - \varepsilon.$$

Only the following condition on $(h_n)$ remains to be checked:

Lemma 19 $(h_n)$ is $\alpha$-UHCP for any $\alpha < 1/2$.
Proof. For any tree $T \in T$, $H_T$ is a simple function of $\hat{H}_T$: let $m_T(0) = 0$, and for any $i \geq 1$, $m_T(i) = \min\{j,\ j > m_T(i-1),\ \hat{H}_T(j) > \hat{H}_T(j-1)\}$; then $H_T(k) = \hat{H}_T(m_T(k))$. In fact, $m_T(k) = \inf\{j,\ F_T(j) = u(k)\}$. One may check inductively on $k$ that

$$m_T(k) + H_T(k) = 2k \quad \text{for any } k \geq 0. \qquad (43)$$
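Identity (43) is easy to test on a concrete tree. The sketch below is an illustration under explicit assumptions: a tree is a set of tuples of positive integers with root the empty tuple, $u(k)$ is the $k$-th node in lexicographic order, $H_T(k) = |u(k)|$, $F_T$ is the depth first traversal that crosses every edge once downwards and once upwards, and $m_T(k)$ is the first time $F_T$ visits $u(k)$. The function names are hypothetical.

```python
def depth_first_order(tree):
    """Nodes u(0), u(1), ... sorted in lexicographic (depth-first) order."""
    return sorted(tree)

def contour_sequence(tree):
    """Depth first traversal F_T(0), ..., F_T(2n): each edge is crossed twice."""
    children = {}
    for node in tree:
        if node:  # every node except the root is attached to its father
            children.setdefault(node[:-1], []).append(node)
    sequence = [()]

    def visit(node):
        for child in sorted(children.get(node, [])):
            sequence.append(child)   # step down to the child
            visit(child)
            sequence.append(node)    # step back up to the father

    visit(())
    return sequence

def identity_43_holds(tree):
    """Check m_T(k) + H_T(k) == 2k for every node index k."""
    first_visit = {}
    for j, node in enumerate(contour_sequence(tree)):
        first_visit.setdefault(node, j)
    nodes = depth_first_order(tree)
    return all(first_visit[u] + len(u) == 2 * k for k, u in enumerate(nodes))

small_tree = {(), (1,), (2,), (1, 1), (1, 2)}
```

The reason behind the identity is a counting argument: when $u(k)$ is first visited, the traversal has made $k$ downward steps and $m_T(k) - k$ upward steps, so $H_T(k) = k - (m_T(k) - k)$.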

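Assume that $|\hat{h}_n(s) - \hat{h}_n(t)| \leq c_1 |t - s|^\alpha$ and $|h_n(s) - h_n(t)| \leq c_2 |t - s|^\beta$, for some $\alpha, c_1, c_2 > 0$, $\beta \in [0, 1/2]$, for any $s, t \in [0,1]$. Then for $s, t \in [0,1]_n$,

$$|h_n(s) - h_n(t)| = \big|\hat{h}_n(m(ns)/(2n)) - \hat{h}_n(m(nt)/(2n))\big| \leq c_1 \left| \frac{m(ns) - m(nt)}{2n} \right|^\alpha \leq c_1 \left( |s - t| + \left| \frac{H(nt) - H(ns)}{2n} \right| \right)^\alpha \leq c_1 \left( |s - t| + c_2 \frac{|t - s|^\beta}{2 \sqrt{n}} \right)^\alpha.$$

When $s, t \in [0,1]_n$ are distinct and $\beta \leq 1/2$, we have $n^{-1/2} \leq |t - s|^{1/2}$ and $|s - t| \leq |t - s|^{\beta + 1/2}$. Hence, for any $s, t \in [0,1]$, by interpolation, we get that $|h_n(s) - h_n(t)| \leq c_3 |t - s|^{\alpha(\beta + 1/2)}$ for a certain constant $c_3$. Hence, if $\hat{h}_n$ is $\alpha$-UHCP and $h_n$ is $\beta$-UHCP then $h_n$ is $f_\alpha(\beta)$-UHCP, where $f_\alpha(\beta) := \alpha(\beta + 1/2)$.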
By Gittenberger [11], for all s, t, £ > 

P(|h;(s) - h;(t)| >£)<C^\s-t\-^ exp (-C2e\s - ij-V^) , 

which gives, for any p > 0, E ^ hn(s) — h„(t) ^ < C{p)\s — Taking p large enough, this 

ensures that for every a < 1/2, the family (h„) is a— UHCP. 

Fix one such $\alpha$. Since $h_n$ is 0-UHCP it is also $f_\alpha(0)$-UHCP, and by successive iterations $f_\alpha^{(k)}(0)$-UHCP. Using that $f_\alpha$ is increasing and has $\beta_\alpha = \alpha/(2(1 - \alpha))$ as finite fixed point, a finite number of iterations shows that $h_n$ is $\beta$-UHCP for every $\beta < \beta_\alpha$. Finally, since $\lim_{\alpha \to 1/2} \beta_\alpha = 1/2$, for any $\beta < 1/2$ we can choose $\alpha < 1/2$ such that $\beta < \beta_\alpha$. □
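The bootstrap at the end of this proof is a fixed-point iteration, which can be sketched numerically. The code below assumes the map $f_\alpha(\beta) = \alpha(\beta + 1/2)$ and its fixed point $\beta_\alpha = \alpha/(2(1-\alpha))$, as read from the argument above; the function names are hypothetical.

```python
def f(alpha, beta):
    """One bootstrap step: a beta-Hoelder exponent is improved to alpha * (beta + 1/2)."""
    return alpha * (beta + 0.5)

def fixed_point(alpha):
    """Solution of beta = alpha * (beta + 1/2)."""
    return alpha / (2.0 * (1.0 - alpha))

def iterate(alpha, steps):
    """Exponents obtained by starting from beta = 0 and applying f repeatedly."""
    beta = 0.0
    trajectory = [beta]
    for _ in range(steps):
        beta = f(alpha, beta)
        trajectory.append(beta)
    return trajectory
```

Starting from $\beta = 0$, the iterates increase towards $\beta_\alpha$ at geometric rate $\alpha$, so any $\beta < \beta_\alpha$ is exceeded after finitely many steps, and $\beta_\alpha$ itself approaches $1/2$ as $\alpha \to 1/2$.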

2.7 Note on the comparison between height processes and contour processes 

Let $T$ be a tree in $T_n$, $F_T$ its depth first traversal, and $(u(k))_{k \in [\![0,n]\!]}$ the sorted list of its nodes. Let $C$ be a function from $(u(k))_{k \in [\![0,n]\!]}$ taking its values in $\mathbb{R}$, and let $C$ and $\hat{C}$ be the two processes

$$C(k) = C(u(k)) \quad \text{and} \quad \hat{C}(k) = C(F_T(k)) \quad \text{for any } k \in [\![0,n]\!],$$

interpolated between integer points. Let $c_n(t) = C(nt)/k(n)$ and $\hat{c}_n(t) = \hat{C}(2nt)/k(n)$.

Proposition 20 Assume (H1). If $k(n) \to +\infty$ and if

$$\frac{\log n\, \sup_l |C(l) - C(l+1)|}{k(n)} \xrightarrow[n]{proba.} 0,$$

then

$$(c_n(t))_{t \in [0,1]} \xrightarrow[n]{(d)} (c(t))_{t \in [0,1]}$$

is equivalent to

$$(\hat{c}_n(t))_{t \in [0,1]} \xrightarrow[n]{(d)} (c(t))_{t \in [0,1]}.$$
Proof. We will prove that $\|c_n - \hat{c}_n\|_\infty \xrightarrow[n]{proba.} 0$.

For any $l \in [\![0,n]\!]$, let $j_l$ be the integer such that $m_T(j_l) + 1 \leq 2l \leq m_T(j_l + 1) - 1$. Since $\hat{C}(m_T(j_l)) = C(j_l)$, we have

$$\max_l |\hat{C}(2l) - C(l)| \leq \max_l |\hat{C}(2l) - \hat{C}(m_T(j_l))| + \max_l |C(j_l) - C(l)|.$$

Take $\rho$ large enough s.t. $\Omega_n^\rho = \{T,\ \sup_l \big||u(l)| - |u(l+1)|\big| \leq \rho \log n\}$ satisfies $P_n(\Omega_n^\rho) \to 1$ (see Lemma 7). On $\Omega_n^\rho$, $|m_T(j_l) - 2l| \leq |m_T(j_l) - m_T(j_l + 1)| \leq |H(j_l) - H(j_l + 1)| + 2 \leq \rho \log n + 2$ and then

$$\max_l |\hat{C}(2l) - \hat{C}(m_T(j_l))| \leq (\rho \log n + 2) \sup_i |\hat{C}(i) - \hat{C}(i+1)| \qquad (44)$$

since for any $i$, $|\hat{C}(i) - \hat{C}(i+1)| = |C(j) - C(j+1)|$ for some $j$ (among $F_T(i)$ and $F_T(i+1)$, one is the father of the other one). Now, $|j_l - l| \leq \max_i |H_T(i)|$, and then since $h_n \xrightarrow{(d)} h$, for any $\varepsilon > 0$,

$$n^{-1/2 - \varepsilon} \sup_l |j_l - l| \xrightarrow[n]{proba.} 0. \qquad (45)$$

Assume that $c_n \xrightarrow{(d)} c$. By (45), $k(n)^{-1} \max_l |C(j_l) - C(l)| \xrightarrow{proba.} 0$, and then $\|c_n - \hat{c}_n\|_\infty \xrightarrow{proba.} 0$.

n n n 

If Cn — > c then A;~^(n)max/ \C{21) — C{mT{ji))\ > 0; on the other hand, max; \C{ji) — 

n n 

C{1)\ = max; \C{mT{3i)) - C{mT{l))\. Since max; \mT{ji) - mT{l)\ < 2max; \ji - l\ + 2max; - 
Hi\, 77,"^/^"^ max; \mT{ji) — '^^(01 ^^°''"") 0. Hence max; \C{ji) — C{l)\/k{n) > Q and then 

n n 

II _ ^ II proba- p, PI 
n 